Comparison of Tree-Based Machine Learning Algorithms to Predict Reporting Behavior of Electronic Billing Machines

Cited by: 7
Authors
Murorunkwere, Belle Fille [1]
Ihirwe, Jean Felicien [2]
Kayijuka, Idrissa [3]
Nzabanita, Joseph [4]
Haughton, Dominique [5, 6, 7]
Affiliations
[1] Univ Rwanda, African Ctr Excellence Data Sci, POB 4285, Kigali, Rwanda
[2] Univ L'Aquila, Dept Informat Engn Comp Sci & Math, I-56121 Pisa, Italy
[3] Univ Rwanda, Dept Appl Stat, POB 4285, Kigali, Rwanda
[4] Univ Rwanda, Coll Sci & Technol, Dept Math, POB 3900, Kigali, Rwanda
[5] Bentley Univ, Dept Math Sci & Global Studies, Waltham, MA 02452 USA
[6] Univ Paris 1 SAMM, Dept Math Sci & Global Studies, F-75634 Paris, France
[7] Univ Toulouse 1 TSE R, Dept Math Sci & Global Studies, F-31042 Toulouse, France
Keywords
tree-based machine learning algorithms; compliance; value added tax; machine learning; electronic billing machines; reporting behavior;
DOI
10.3390/info14030140
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Tax fraud is a common problem for many tax administrations and costs billions of dollars. Tax administrations have considered several options to optimize revenue, among them the electronic billing machine (EBM), which aims to monitor all business transactions and thereby boost value added tax (VAT) revenue and compliance. Most current research has focused on the impact of EBMs on VAT revenue collection and compliance rather than on how EBM reporting behavior influences future compliance. The essential contribution of this study is that it leverages both the historical reporting behavior of EBMs and actual business characteristics to understand and predict their future reporting behavior. Tree-based machine learning algorithms such as decision trees, random forest, gradient boosting, and XGBoost are applied, tested, and compared. The results show the robustness of the random forest model relative to the others, with an accuracy of 92.3%. The paper sets out the contribution of this approach relative to existing approaches through well-defined research questions, analysis mechanisms, and constructive discussion. Once applied, we believe the approach could help the tax-collecting agency conduct timely interventions on EBM compliance, which in turn would help achieve the EBM objective of improving VAT compliance.
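The kind of comparison the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic (generated with scikit-learn's `make_classification`), the hyperparameters are library defaults, and the helper name `compare_models` is hypothetical. XGBoost is left out so the sketch depends only on scikit-learn; if the `xgboost` package is installed, an `xgboost.XGBClassifier` could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the study's EBM records are not
# available here, so a generic binary classification problem is generated.
X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Tree-based classifiers with default settings (assumption: no tuning).
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model on the training split and return held-out accuracy."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


scores = compare_models(models, X_train, y_train, X_test, y_test)

# Print the models from highest to lowest held-out accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Accuracy on a single held-out split is only one possible yardstick; with imbalanced reporting classes, metrics such as precision, recall, or F1 from `sklearn.metrics` may be more informative.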
Pages: 21
Related papers
50 in total
  • [21] Tree-based machine learning algorithms in the Internet of Things environment for multivariate flood status prediction
    Aswad, Firas Mohammed
    Kareem, Ali Noori
    Khudhur, Ahmed Mahmood
    Khalaf, Bashar Ahmed
    Mostafa, Salama A.
    JOURNAL OF INTELLIGENT SYSTEMS, 2022, 31 (01) : 1 - 14
  • [22] Comparison of the Tree-Based Machine Learning Algorithms to Cox Regression in Predicting the Survival of Oral and Pharyngeal Cancers: Analyses Based on SEER Database
    Du, Mi
    Haag, Dandara G.
    Lynch, John W.
    Mittinty, Murthy N.
    CANCERS, 2020, 12 (10) : 1 - 16
  • [23] Pixel-wise classification in graphene-detection with tree-based machine learning algorithms
    Cho, Woon Hyung
    Shin, Jiseon
    Kim, Young Duck
    Jung, George J.
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2022, 3 (04):
  • [24] Tree-based Machine Learning Methods for Survey Research
    Kern, Christoph
    Klausch, Thomas
    Kreuter, Frauke
    SURVEY RESEARCH METHODS, 2019, 13 (01) : 73 - 93
  • [25] Cosmic string detection with tree-based machine learning
    Sadr, A. Vafaei
    Farhang, M.
    Movahed, S. M. S.
    Bassett, B.
    Kunz, M.
    MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY, 2018, 478 (01) : 1132 - 1140
  • [26] Comparison of tree-based ensemble learning algorithms for landslide susceptibility mapping in Murgul (Artvin), Turkey
    Usta, Ziya
    Akinci, Halil
    Akin, Alper Tunga
    EARTH SCIENCE INFORMATICS, 2024, 17 (02) : 1459 - 1481
  • [27] Optimization of Tree-Based Machine Learning Models to Predict the Length of Hospital Stay Using Genetic Algorithm
    Mansoori, A.
    Zeinalnezhad, M.
    Nazarimanesh, L.
    Journal of Healthcare Engineering, 2023, 2023
  • [28] Confirming the statistically significant superiority of tree-based machine learning algorithms over their counterparts for tabular data
    Uddin, Shahadat
    Lu, Haohui
    PLOS ONE, 2024, 19 (04):
  • [29] Iceberg-seabed interaction evaluation in clay seabed using tree-based machine learning algorithms
    Azimi, Hamed
    Shiri, Hodjat
    Mahdianpari, Masoud
    JOURNAL OF PIPELINE SCIENCE AND ENGINEERING, 2022, 2 (04):
  • [30] Intrusion Detection and Identification Using Tree-Based Machine Learning Algorithms on DCS Network in the Oil Refinery
    Kim, Kyoung Ho
    Kwak, Byung Il
    Han, Mee Lan
    Kim, Huy Kang
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2022, 37 (06) : 4673 - 4682