End-to-End Handwritten Text Detection and Transcription in Full Pages

Cited by: 12
Authors
Carbonell, Manuel [1]
Mas, Joan [2]
Villegas, Mauricio [1]
Fornes, Alicia [2]
Llados, Josep [2]
Affiliations
[1] Omni Us, Berlin, Germany
[2] Computer Vision Center, Barcelona, Spain
Keywords
Handwritten Text Recognition; Layout Analysis; Text Segmentation; Deep Neural Networks; Multi-task Learning
DOI
10.1109/ICDARW.2019.40077
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods perform recognition at the paragraph level. Even so, errors in paragraph segmentation can still degrade transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. Performing text detection and transcription jointly removes the need for layout analysis at test time. The experimental results show that our approach achieves results comparable to those of models that assume pre-segmented paragraphs, and suggest that performing the two tasks jointly improves on performing them separately.
Pages: 29-34
Page count: 6