End-to-End Handwritten Text Detection and Transcription in Full Pages

Cited by: 12
Authors
Carbonell, Manuel [1]
Mas, Joan [2]
Villegas, Mauricio [1]
Fornes, Alicia [2]
Llados, Josep [2]
Affiliations
[1] Omni Us, Berlin, Germany
[2] Computer Vision Center, Barcelona, Spain
Keywords
Handwritten Text Recognition; Layout Analysis; Text Segmentation; Deep Neural Networks; Multi-task Learning
DOI
10.1109/ICDARW.2019.40077
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods perform recognition at the paragraph level. Even so, errors in paragraph segmentation can still degrade transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. Joint text detection and transcription removes the need for layout analysis at test time. The experimental results show that our approach can achieve results comparable to models that assume segmented paragraphs, and suggest that performing the two tasks jointly improves on performing them separately.
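The abstract does not state the training objective. As a rough illustration of what joint training of detection and transcription can look like in a multi-task setup, the sketch below combines two per-batch loss values into a single weighted objective. The function name `joint_loss`, the weights `det_weight` and `rec_weight`, and the weighted-sum form itself are assumptions made for illustration, not the authors' formulation.

```python
def joint_loss(detection_loss: float,
               transcription_loss: float,
               det_weight: float = 1.0,
               rec_weight: float = 1.0) -> float:
    """Combine two task losses into one scalar objective.

    Hypothetical sketch: a plain weighted sum, a common way to train
    two tasks jointly. The weights are illustrative assumptions.
    """
    if det_weight < 0 or rec_weight < 0:
        raise ValueError("loss weights must be non-negative")
    return det_weight * detection_loss + rec_weight * transcription_loss


# Example per-batch values (made up for illustration only).
total = joint_loss(detection_loss=0.5, transcription_loss=2.0, rec_weight=0.5)
print(total)
```

With a single scalar objective of this kind, one optimization step updates the parameters shared by both tasks, which is the general idea behind training the two tasks together instead of in separate stages.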
Pages: 29-34
Number of pages: 6