Are End-to-End Systems Really Necessary for NER on Handwritten Document Images?

Cited by: 4
Authors
Tueselmann, Oliver [1]
Wolf, Fabian [1]
Fink, Gernot A. [1]
Affiliations
[1] TU Dortmund Univ, Dept Comp Sci, D-44227 Dortmund, Germany
Keywords
Named entity recognition; Document image analysis; Information retrieval; Handwritten documents; Recognition
DOI
10.1007/978-3-030-86331-9_52
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Named entities (NEs) are fundamental in the extraction of information from text. The recognition and classification of these entities into predefined categories is called Named Entity Recognition (NER) and plays a major role in Natural Language Processing. However, only a few works consider this task with respect to the document image domain. The approaches are either based on a two-stage or end-to-end architecture. A two-stage approach transforms the document image into a textual representation and determines the NEs using a textual NER. The end-to-end approach, on the other hand, avoids the explicit recognition step at text level and determines the NEs directly on image level. Current approaches that try to tackle the task of NER on segmented word images use end-to-end architectures. This is motivated by the assumption that handwriting recognition is too erroneous to allow for an effective application of textual NLP methods. In this work, we present a two-stage approach and compare it against state-of-the-art end-to-end approaches. Due to the lack of datasets and evaluation protocols, such a comparison is currently difficult. Therefore, we manually annotated the known IAM and George Washington datasets with NE labels and publish them along with optimized splits and an evaluation protocol. Our experiments show, contrary to the common belief, that a two-stage model can achieve higher scores on all tested datasets.
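The two-stage idea described in the abstract (first transcribe each segmented word image, then run a textual NER over the transcription) can be illustrated with a minimal sketch. This is not the authors' implementation: `recognize_word`, `tag_tokens`, the `GAZETTEER` lookup, and the `PER`/`LOC`/`O` label set are hypothetical stand-ins chosen for illustration, and the "word images" are placeholder dictionaries that already carry a transcription so the example stays self-contained.

```python
from typing import Dict, List, Tuple

# Hypothetical label lookup standing in for a trained textual NER model.
# The label names (PER, LOC, O) are assumptions, not taken from the paper.
GAZETTEER: Dict[str, str] = {"alice": "PER", "dortmund": "LOC"}


def recognize_word(word_image: Dict[str, str]) -> str:
    """Stage 1 stand-in: map a segmented word image to its transcription.

    A real system would run a handwriting recognizer here; this placeholder
    simply reads a precomputed transcription from the input dictionary.
    """
    return word_image["transcription"]


def tag_tokens(tokens: List[str]) -> List[str]:
    """Stage 2 stand-in: assign one NE label per token on the text level."""
    return [GAZETTEER.get(token.lower(), "O") for token in tokens]


def two_stage_ner(word_images: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Chain both stages: transcribe every word image, then tag the text."""
    tokens = [recognize_word(image) for image in word_images]
    labels = tag_tokens(tokens)
    return list(zip(tokens, labels))


if __name__ == "__main__":
    page = [
        {"transcription": "Alice"},
        {"transcription": "visited"},
        {"transcription": "Dortmund"},
    ]
    for token, label in two_stage_ner(page):
        print(token, label)
```

Because the second stage only sees text, recognition errors from the first stage propagate into the tagging; an end-to-end model, by contrast, would predict the labels directly from the images without the intermediate transcription.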
Pages: 808-822
Number of pages: 15