Are End-to-End Systems Really Necessary for NER on Handwritten Document Images?

Cited by: 4
Authors
Tueselmann, Oliver [1]
Wolf, Fabian [1]
Fink, Gernot A. [1]
Affiliations
[1] TU Dortmund Univ, Dept Comp Sci, D-44227 Dortmund, Germany
Keywords
Named entity recognition; Document image analysis; Information retrieval; Handwritten documents; Recognition
DOI
10.1007/978-3-030-86331-9_52
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Named entities (NEs) are fundamental in the extraction of information from text. The recognition and classification of these entities into predefined categories is called Named Entity Recognition (NER) and plays a major role in Natural Language Processing. However, only a few works consider this task with respect to the document image domain. The approaches are either based on a two-stage or end-to-end architecture. A two-stage approach transforms the document image into a textual representation and determines the NEs using a textual NER. The end-to-end approach, on the other hand, avoids the explicit recognition step at text level and determines the NEs directly on image level. Current approaches that try to tackle the task of NER on segmented word images use end-to-end architectures. This is motivated by the assumption that handwriting recognition is too erroneous to allow for an effective application of textual NLP methods. In this work, we present a two-stage approach and compare it against state-of-the-art end-to-end approaches. Due to the lack of datasets and evaluation protocols, such a comparison is currently difficult. Therefore, we manually annotated the known IAM and George Washington datasets with NE labels and publish them along with optimized splits and an evaluation protocol. Our experiments show, contrary to the common belief, that a two-stage model can achieve higher scores on all tested datasets.
Pages: 808-822
Page count: 15