Identifying financial statement fraud with decision rules obtained from Modified Random Forest

Cited by: 23
Authors
An, Byungdae [1]
Suh, Yongmoo [1]
Affiliations
[1] Korea Univ, MIS, Sch Business, Seoul, South Korea
Keywords
Financial statement fraud; Random forest; Decision rules; Feature importance; Machine learning; Predictive model; Data mining techniques; Information asymmetry; Corporate governance; Management; Tree; Classification; Complexity
DOI
10.1108/DTA-11-2019-0208
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Purpose: Financial statement fraud (FSF) suggests that the company committing it may not be in a healthy state. Detecting FSF matters because such companies tend to conceal unfavorable information, which causes great losses to various stakeholders. The objective of this paper is therefore to propose a novel approach to building a classification model for identifying FSF, one that shows high classification performance and from which human-readable rules can be extracted to explain why a company is likely to commit FSF.

Design/methodology/approach: Having prepared multiple sub-datasets to cope with the class imbalance problem, we build a set of decision trees for each sub-dataset; select a subset of that set as the model for the sub-dataset by removing every tree whose accuracy is below the average accuracy of all trees in the set; and then select, from among these models, the one that shows the best accuracy. We call the resulting model MRF (Modified Random Forest). Given a new instance, we extract rules from the MRF model to explain whether the company corresponding to that instance is likely to commit FSF.

Findings: Experimental results show that the MRF classifier outperformed the benchmark models. The results also revealed that all the variables related to profit are among the most important indicators of FSF, and identified two new variables related to gross profit that had not been considered in previous studies on FSF.

Originality/value: This study proposes a method for building a classification model that shows outstanding performance and provides decision rules that can be used to explain the classification results. It also suggests a new way to resolve the class imbalance problem.
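The procedure outlined under Design/methodology/approach can be illustrated with a rough Python sketch. It is not the authors' implementation: the abstract does not say how the sub-datasets are formed, how the trees are grown, on which data accuracy is measured, or how rules are extracted, so every one of those choices here is an assumption. In particular, the sketch assumes balanced sub-datasets obtained by pairing all fraud rows with disjoint chunks of non-fraud rows, uses depth-1 trees (stumps) as a stand-in for full decision trees, and runs on synthetic data. All function names are hypothetical.

```python
import random

def make_subdatasets(y, rng):
    """Assumption: pair all minority rows (label 1) with disjoint majority chunks."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    rng.shuffle(neg)
    step = len(pos)
    return [pos + neg[k:k + step] for k in range(0, len(neg) - step + 1, step)]

def predict_stump(stump, x):
    feature, threshold, above = stump
    return above if x[feature] > threshold else 1 - above

def predict_forest(forest, x):
    votes = sum(predict_stump(s, x) for s in forest)  # number of votes for class 1
    return 1 if 2 * votes >= len(forest) else 0

def accuracy(predict, X, y, rows):
    return sum(predict(X[i]) == y[i] for i in rows) / len(rows)

def fit_stump(X, y, rows, feature):
    """Depth-1 tree: try every observed threshold and both label orientations."""
    best_acc, best = -1.0, None
    for threshold in sorted({X[i][feature] for i in rows}):
        for above in (0, 1):
            stump = (feature, threshold, above)
            acc = accuracy(lambda x: predict_stump(stump, x), X, y, rows)
            if acc > best_acc:
                best_acc, best = acc, stump
    return best

def build_forest(X, y, rows, n_trees, rng):
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # bootstrap sample of the sub-dataset
        feature = rng.randrange(len(X[0]))          # one random feature per stump
        forest.append(fit_stump(X, y, sample, feature))
    return forest

def prune(forest, X, y, rows):
    """Drop trees whose accuracy is below the forest average (integer counts, exact)."""
    hits = [sum(predict_stump(s, X[i]) == y[i] for i in rows) for s in forest]
    return [s for s, h in zip(forest, hits) if h * len(forest) >= sum(hits)]

def build_mrf(X, y, n_trees=15, seed=0):
    """Build one pruned forest per sub-dataset and keep the most accurate one."""
    rng = random.Random(seed)
    all_rows = range(len(y))  # assumption: models are compared on the full data
    best_model, best_acc = None, -1.0
    for rows in make_subdatasets(y, rng):
        model = prune(build_forest(X, y, rows, n_trees, rng), X, y, rows)
        acc = accuracy(lambda x: predict_forest(model, x), X, y, all_rows)
        if acc > best_acc:
            best_model, best_acc = model, acc
    return best_model, best_acc

def explain(forest, x):
    """Assumption: report the conditions of the trees that agree with the vote."""
    label = predict_forest(forest, x)
    rules = set()
    for stump in forest:
        if predict_stump(stump, x) == label:
            feature, threshold, _ = stump
            op = ">" if x[feature] > threshold else "<="
            rules.add(f"feature[{feature}] {op} {threshold:.3f}")
    return label, sorted(rules)

# Synthetic data: 8 "fraud" rows and 40 "non-fraud" rows; only feature 0 is informative.
data_rng = random.Random(1)
X = [[data_rng.uniform(0.6, 1.0), data_rng.random()] for _ in range(8)]
X += [[data_rng.uniform(0.0, 0.5), data_rng.random()] for _ in range(40)]
y = [1] * 8 + [0] * 40

model, acc = build_mrf(X, y)
label, rules = explain(model, X[0])
print(len(model), acc, label, rules)
```

Comparing integer hit counts in `prune` rather than floating-point averages guarantees that at least one tree always survives, since the best tree can never fall below the mean.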
Pages: 235-255
Page count: 21