Comparison of Tree-Based Machine Learning Algorithms to Predict Reporting Behavior of Electronic Billing Machines

Cited by: 7
Authors
Murorunkwere, Belle Fille [1]
Ihirwe, Jean Felicien [2]
Kayijuka, Idrissa [3]
Nzabanita, Joseph [4]
Haughton, Dominique [5, 6, 7]
Affiliations
[1] Univ Rwanda, African Ctr Excellence Data Sci, POB 4285, Kigali, Rwanda
[2] Univ L'Aquila, Dept Informat Engn Comp Sci & Math, I-56121 Pisa, Italy
[3] Univ Rwanda, Dept Appl Stat, POB 4285, Kigali, Rwanda
[4] Univ Rwanda, Coll Sci & Technol, Dept Math, POB 3900, Kigali, Rwanda
[5] Bentley Univ, Dept Math Sci & Global Studies, Waltham, MA 02452 USA
[6] Univ Paris 1 SAMM, Dept Math Sci & Global Studies, F-75634 Paris, France
[7] Univ Toulouse 1 TSE R, Dept Math Sci & Global Studies, F-31042 Toulouse, France
Keywords
tree-based machine learning algorithms; compliance; value added tax; machine learning; electronic billing machines; reporting behavior
DOI
10.3390/info14030140
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Tax fraud is a common problem for many tax administrations and costs billions of dollars. Tax administrations have considered several options for optimizing revenue, among them the electronic billing machine (EBM), which is intended to monitor all business transactions and thereby boost value added tax (VAT) revenue and compliance. Most current research has focused on the impact of EBMs on VAT revenue collection and compliance rather than on how EBM reporting behavior influences future compliance. The essential contribution of this study is that it leverages both the historical reporting behavior of EBMs and actual business characteristics to understand and predict their future reporting behavior. Tree-based machine learning algorithms such as decision trees, random forest, gradient boosting, and XGBoost are applied, tested, and compared. The random forest model proved the most robust of these, with an accuracy of 92.3%. The paper positions this contribution relative to existing approaches through well-defined research questions, analysis mechanisms, and constructive discussion. Once applied, we believe that our approach could help the tax-collecting agency conduct timely interventions on EBM compliance, which would in turn support the EBM objective of improving VAT compliance.
Pages: 21