Predicting accounting fraud using imbalanced ensemble learning classifiers - evidence from China

Cited by: 8
Authors
Rahman, Md Jahidur [1]
Zhu, Hongtao [2]
Affiliations
[1] Wenzhou Kean Univ, Wenzhou, Peoples R China
[2] Univ Edinburgh, Edinburgh, Scotland
Source
ACCOUNTING AND FINANCE | 2023 / Vol. 63 / Issue 03
Keywords
Accounting fraud detection; Artificial intelligence; China A-share; CUSBoost; Ensemble learning algorithms; Machine learning; RUSBoost; FINANCIAL STATEMENT FRAUD; BANKRUPTCY PREDICTION; DECISION TREE; MACHINE; CLASSIFICATION; COMPENSATION; GOVERNANCE; REGRESSION; ARTICLE; FUSION;
DOI
10.1111/acfi.13044
Chinese Library Classification (CLC) number
F8 [Public Finance, Finance];
Discipline classification code
0202;
Abstract
This study develops accounting fraud detection models for China A-share listed firms using imbalanced ensemble learning algorithms. Based on a sample of 33,544 Chinese firm-year observations from 1998 to 2017, it builds one logistic regression classifier and four ensemble learning classifiers (AdaBoost, XGBoost, CUSBoost, and RUSBoost) from 12 financial ratios and 28 raw financial data items. The sample is divided into training and test observations to evaluate the classifiers' out-of-sample performance. Two metrics, the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision-recall curve (AUPR), measure the classifiers' discriminative ability. In a supplementary test, the study proposes an algebraic fused model built on the four ensemble learning classifiers and introduces the sliding window technique. The empirical results show that the ensemble learning classifiers detect accounting fraud in the imbalanced sample of China A-share listed firms far more effectively than the logistic regression model. Moreover, the imbalanced ensemble learning classifiers (CUSBoost and RUSBoost) outperform the common ensemble learning models (AdaBoost and XGBoost) on average. The algebraic fused model in the supplementary test also obtains the highest average AUC and AUPR among all the algorithms employed. These results offer firm support for the potential role of machine learning (ML)-based artificial intelligence (AI) approaches in reliably predicting accounting fraud with high accuracy. In the Chinese setting in particular, the ML-based AI approach offers a considerable advantage in forecasting accounting fraud. Finally, the paper fills a research gap on the application of imbalanced ensemble learning to accounting fraud detection for Chinese listed firms.
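The evaluation ideas named in the abstract (AUC, AUPR, score fusion, and sliding-window splits) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names are hypothetical, the fusion rule is assumed to be a simple mean of the classifiers' scores (the abstract does not state which algebraic rule was used), and the window length is an arbitrary illustrative choice.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (fraud, non-fraud) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tied pair counts as half a correct ranking
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Assumes no tied scores; ties would need threshold-level grouping.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    hits, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank  # precision at each recalled fraud case
    return ap / total_pos


def fuse_mean(score_lists: Sequence[Sequence[float]]) -> List[float]:
    """Assumed fusion rule: average the classifiers' scores per observation."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


def sliding_windows(years: Sequence[int], train_span: int) -> List[Tuple[List[int], int]]:
    """Pair each block of `train_span` consecutive years with the next year as test."""
    ordered = sorted(set(years))
    return [
        (ordered[i:i + train_span], ordered[i + train_span])
        for i in range(len(ordered) - train_span)
    ]


if __name__ == "__main__":
    # Toy, made-up data: 2 fraud cases among 8 firm-years (imbalanced).
    labels = [0, 0, 1, 0, 1, 0, 0, 0]
    model_a = [0.10, 0.40, 0.35, 0.20, 0.80, 0.30, 0.05, 0.50]
    model_b = [0.20, 0.10, 0.70, 0.30, 0.60, 0.45, 0.15, 0.25]
    fused = fuse_mean([model_a, model_b])
    for name, scores in [("A", model_a), ("B", model_b), ("fused", fused)]:
        print(name, roc_auc(labels, scores), average_precision(labels, scores))
    print(sliding_windows(range(1998, 2003), train_span=3))
```

AUPR is reported alongside AUC because, with few fraud cases, precision reacts strongly to false positives that barely move the ROC curve. In practice a library routine would normally replace these hand-written metrics.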
Pages: 3455-3486
Page count: 32