DocExtractNet: A novel framework for enhanced information extraction from business documents

Authors
Yan, Zhengjin [1]
Ye, Zheng [1]
Ge, Jun [2]
Qin, Jun [1]
Liu, Jing [1]
Cheng, Yu [3]
Gurrin, Cathal [4]
Affiliations
[1] South Cent Minzu Univ, Coll Comp Sci & Informat Phys Fus Intelligent Comp, Key Lab Natl Ethn Affairs Commiss, Wuhan, Peoples R China
[2] Wuchang Univ Technol, Sch Artificial Intelligence, Wuhan, Peoples R China
[3] Hangzhou Boyan Private Equ Fund Management Partner, Hangzhou, Peoples R China
[4] Dublin City Univ, Dublin, Ireland
Keywords
Receipt information extraction; LayoutLMv3; ImageEnhance; PrecisionHints; CrossModalFusion
DOI
10.1016/j.ipm.2024.104046
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Efficient extraction of critical information from receipts is essential for automating financial processes and supporting timely decision-making in businesses. However, the task faces several challenges: scanned receipt images vary in quality because of differences in scanning equipment, receipt formats are diverse, and handwritten elements and noise further hinder accurate extraction. To address these issues, we propose DocExtractNet, a framework based on LayoutLMv3 and designed for extracting key information from receipts. First, we introduce the ImageEnhance method to process image-modality features, enhancing image clarity and significantly improving recognition accuracy on low-quality images. Second, we implement the PrecisionHints strategy to supplement missing key-value pairs in the text modality, improving data integrity and the model's overall performance. Third, we apply the CrossModalFusion method to combine image and text features, allowing the model to better understand and extract receipt information. Experimental results on the Finance-Receipts, FUNSD, and CORD datasets show that DocExtractNet significantly improves F1 scores compared to other models, reaching 97.07% on Finance-Receipts, 91.80% on FUNSD, and 97.38% on CORD, highlighting its superior performance in receipt information extraction.
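The abstract reports F1 scores but does not say how they are computed. As a minimal sketch, assuming the entity-level, exact-match F1 that is common for key information extraction on datasets such as FUNSD and CORD, the metric can be illustrated as follows. The function name `entity_f1`, the `(label, start, end)` entity representation, and the sample entities are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical entity representation: (label, start_token, end_token).
Entity = Tuple[str, int, int]


def entity_f1(gold: Iterable[Entity], pred: Iterable[Entity]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 over predicted entities.

    Assumption: a prediction counts as correct only when its label and
    both span boundaries match a gold entity exactly.
    """
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative receipt fields (made-up spans, not data from the paper).
gold_entities = [("TOTAL", 0, 1), ("DATE", 3, 4), ("COMPANY", 6, 8)]
pred_entities = [("TOTAL", 0, 1), ("DATE", 3, 5)]  # DATE span is off by one

scores = entity_f1(gold_entities, pred_entities)
print(scores)
```

Under this assumption, a prediction with the right label but a wrong boundary (the DATE entity here) counts as both a false positive and a false negative, which is why exact-match F1 is sensitive to the OCR and layout noise the abstract describes.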
Pages: 15