Predicting tax fraud using supervised machine learning approach

Cited by: 1
Authors
Murorunkwere, Belle Fille [1]
Haughton, Dominique [2]
Nzabanita, Joseph [3]
Kipkogei, Francis [4]
Kabano, Ignace [5]
Affiliations
[1] Univ Rwanda, African Ctr Excellence Data Sci, Rwanda Revenue Author, Kigali, Rwanda
[2] Univ Toulouse TSE R 1, Univ Paris 1 SAMM, Toulouse, France
[3] Univ Rwanda, Coll Sci & Technol, Sch Sci, Kigali, Rwanda
[4] Stepwise Inc, Zalda, Nairobi, Kenya
[5] Univ Rwanda, Coll Business & Econ, African Ctr Excellence Data Sci, Kigali, Rwanda
Keywords
tax fraud; fraud detection; features importance; supervised machine-learning models; evaluation metrics
DOI
10.1080/20421338.2023.2187930
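Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract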
Advances in technology have broadened the tax base in Rwanda, and tax fraud has grown as a result. Depending on the dataset used, fraud detection experts and researchers have applied different methods to identify questionable cases. This paper aims to predict features of tax fraud using the most robust supervised machine-learning model. It also provides a setting in which a fraud expert can use a machine-learning model, with the implemented model offering the expert instant feedback. We evaluate supervised machine-learning models such as Artificial Neural Network, Logistic Regression, Decision Tree, Random Forest, GaussianNB and XGBoost. Based on different evaluation metrics, the Artificial Neural Network was the most robust model for predicting tax fraud. The findings reveal that the following are associated with greater susceptibility to tax fraud: the time of business (the time elapsed between when a business started and when it was audited), domestic businesses, taxpayers who import and export goods, those with no losses, those whose businesses are located in the Eastern Province, and those registered for withholding and Value Added Tax types. This study is among the few to evaluate the effectiveness of multiple supervised machine-learning models for identifying tax fraud factors on an accurate data set with numerous tax types. The evidence generated in this study will serve as a valuable tool for both tax policymakers and auditors, and will raise awareness of more robust methods for predicting tax fraud.
Pages: 731-742
Number of pages: 12