Automatic Electronic Invoice Classification Using Machine Learning Models

Cited by: 9
Authors
Bardelli, Chiara [1]
Rondinelli, Alessandro [2]
Vecchio, Ruggero [2]
Figini, Silvia [3]
Affiliations
[1] Univ Pavia, Dept Computat Math & Decis Sci, I-27100 Pavia, Italy
[2] Datevit Spa, I-20090 Assago, Italy
[3] Univ Pavia, Dept Polit & Social Sci, I-27100 Pavia, Italy
Keywords
multiclass classification; text mining; accounting control system; algorithms
DOI
10.3390/make2040033
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Electronic invoicing has been mandatory for Italian companies since January 2019. All invoices follow a predefined XML template, which facilitates information extraction. The main aim of this paper is to exploit the information contained in electronic invoices to build an intelligent system that simplifies accountants' work. More precisely, this contribution shows how part of the accounting process can be automated: all the invoices of a company are classified into specific codes that represent the economic nature of the financial transactions. To accomplish this task, a multiclass classification algorithm is proposed to predict two target variables, the account code and the VAT code, which are part of the general ledger entry. To apply this model to real datasets, a multi-step procedure is proposed: first, a matching algorithm is used to reconstruct the training set; then, the input data are processed and prepared for the training phase; finally, a classification algorithm is trained. Different classification algorithms, including ensemble models and neural networks, are compared in terms of prediction accuracy. The models under comparison show optimal results in predicting the target variables, meaning that machine learning classifiers succeed in translating the complex rules of the accounting process into an automated model. A final study suggests that the best performance can be achieved by considering the hierarchical structure of the account codes and splitting the classification task into smaller sub-problems.
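The hierarchical idea mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names ensemble models and neural networks, whereas this example swaps in a tiny naive Bayes text classifier so that it stays self-contained. The account codes, the invoice descriptions, and the assumption that the first two digits of a code identify its parent group are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens (Laplace smoothing)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class HierarchicalClassifier:
    """Predict the parent group first, then the full code within that group."""

    def __init__(self, prefix_len=2):
        # Assumption: the leading digits of a code identify its parent group.
        self.prefix_len = prefix_len

    def fit(self, texts, codes):
        groups = [code[: self.prefix_len] for code in codes]
        self.top = NaiveBayes().fit(texts, groups)
        by_group = defaultdict(lambda: ([], []))
        for text, code, group in zip(texts, codes, groups):
            by_group[group][0].append(text)
            by_group[group][1].append(code)
        # One smaller sub-problem per parent group.
        self.sub = {g: NaiveBayes().fit(t, c) for g, (t, c) in by_group.items()}
        return self

    def predict(self, text):
        group = self.top.predict(text)
        return self.sub[group].predict(text)


# Hypothetical invoice line descriptions and account codes, for illustration only.
train_texts = [
    "office paper and toner",
    "printer toner cartridges",
    "electricity bill march",
    "gas and electricity supply",
    "legal consulting fee",
    "tax consulting services",
]
train_codes = ["6001", "6001", "6002", "6002", "7001", "7001"]

model = HierarchicalClassifier().fit(train_texts, train_codes)
print(model.predict("toner for office printer"))
```

Each sub-classifier only has to separate the codes that share a parent group, which is the sense in which the task is split into smaller sub-problems.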
Pages: 617-629
Number of pages: 13