End-to-End Handwritten Text Detection and Transcription in Full Pages

Cited by: 12
Authors
Carbonell, Manuel [1]
Mas, Joan [2]
Villegas, Mauricio [1]
Fornes, Alicia [2]
Llados, Josep [2]
Affiliations
[1] Omni Us, Berlin, Germany
[2] Comp Vis Ctr, Barcelona, Spain
Keywords
Handwritten Text Recognition; Layout Analysis; Text Segmentation; Deep Neural Networks; Multi-task Learning
DOI
10.1109/ICDARW.2019.40077
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods perform recognition at the paragraph level. Even so, errors in paragraph segmentation can still degrade transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. Jointly detecting and transcribing text removes the need for layout analysis at test time. The experimental results show that our approach achieves results comparable to those of models that assume segmented paragraphs, and suggest that performing the two tasks jointly improves on performing them separately.
Pages: 29-34
Number of pages: 6