Topic Discovery and Topic-Driven Clustering for Audit Method Datasets

Cited by: 0
Authors
Zhao, Ying [1]
Fu, Wanyu [1]
Huang, Shaobin [2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Harbin Engn Univ, Coll Comp Sci Technol, Harbin 150001, Peoples R China
Funding
National Science Foundation (USA);
Keywords
topic-driven clustering; audit methods; topic discovery;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the promotion of China's Golden Auditing Project and the rapid growth of online auditing, thousands of new computer audit methods emerge every year to meet the varied needs of audit practice. Organizing these existing computer audit methods and using them intelligently has become a fundamental and challenging problem. In this paper, we propose using topic-driven clustering methods to organize computer audit methods according to the system of computer audit methods issued by the National Audit Office of China. We also apply latent Dirichlet allocation (LDA) analysis to audit method datasets at different levels of granularity. Our experimental results on social insurance computer audit methods show that the topic-driven clustering scheme with topics created by domain experts is the best scheme overall, achieving an average purity of 0.862 across the datasets. Topics discovered by LDA were consistent with the classes defined in the taxonomy for four out of five datasets, and they were effective when used in the topic-driven clustering scheme.
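The abstract reports clustering quality as purity. This record does not give the paper's exact formula, so the following is only a minimal sketch that assumes the standard definition: each cluster is credited with its most frequent true class, and the credited counts are summed and divided by the number of items. The function name `purity` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def purity(cluster_ids, class_labels):
    """Standard clustering purity (assumed definition, not confirmed by the paper).

    cluster_ids  -- cluster assignment for each item
    class_labels -- reference (taxonomy) class for each item
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count how often each reference class occurs inside each cluster.
    counts_by_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        counts_by_cluster.setdefault(cluster, Counter())[label] += 1

    # Each cluster contributes the size of its majority class.
    majority_total = sum(max(c.values()) for c in counts_by_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: six audit methods, two clusters, three classes.
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_classes = ["pension", "pension", "medical", "medical", "medical", "unemployment"]
toy_purity = purity(toy_clusters, toy_classes)
```

A purity of 1.0 means every cluster contains items from a single class. An average such as the reported 0.862 would then be the mean of this value over the individual datasets, assuming the same definition.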
Pages: 346 / +
Page count: 3
Related papers
50 in total
  • [1] Topic-driven Clustering for Document Datasets
    Zhao, Ying
    Karypis, George
    PROCEEDINGS OF THE FIFTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2005, : 358 - 369
  • [2] Topic-Driven Testing
    Rau, Andreas
    PROCEEDINGS OF THE 2017 IEEE/ACM 39TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING COMPANION (ICSE-C 2017), 2017, : 409 - 412
  • [3] Developing a topic-driven method for interdisciplinarity analysis
    Kim, Hyeyoung
    Park, Hyelin
    Song, Min
    JOURNAL OF INFORMETRICS, 2022, 16 (02)
  • [4] Topic-Driven Environmental Rhetoric
    Martinez, Diane
    TECHNICAL COMMUNICATION, 2018, 65 (02) : 230 - 231
  • [5] Topic-Driven Environmental Rhetoric
    Lundgren, Zachary
    TECHNICAL COMMUNICATION QUARTERLY, 2020, 29 (01) : 103 - 106
  • [6] The design and implementation of a topic-driven crawler
    Li, Qiong
    Jin, Tao
    Fu, Yuchen
    Liu, Quan
    Cui, Zhiming
    IITA 2007: WORKSHOP ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, PROCEEDINGS, 2007, : 153 - 156
  • [7] An incremental approach to link evaluation in topic-driven Web resource discovery
    Zhang, HX
    Huang, ST
    ALGORITHMIC APPLICATIONS IN MANAGEMENT, PROCEEDINGS, 2005, 3521 : 301 - 310
  • [8] Topic discovery method based on topic model combined with hierarchical clustering
    Wang, An
    Zhang, Junjie
    PROCEEDINGS OF 2020 IEEE 5TH INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC 2020), 2020, : 814 - 818
  • [9] A Semi-supervised Topic-Driven Approach for Clustering Textual Answers to Survey Questions
    Yang, Hui
    Mysore, Ajay
    Wallace, Sharonda
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2009, 5678 : 374 - +
  • [10] VIBE: Topic-Driven Temporal Adaptation for Twitter Classification
    Zhang, Yuji
    Li, Jing
    Li, Wenjie
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 3340 - 3354