End-to-End Handwritten Paragraph Text Recognition Using a Vertical Attention Network

Cited by: 22
Authors
Coquenet, Denis [1, 2, 3]
Chatelain, Clement [3, 4]
Paquet, Thierry [1, 2, 3]
Affiliations
[1] LITIS EA 4108, F-76800 Saint Etienne Du Rouvray, France
[2] Univ Rouen Normandy, F-76000 Rouen, France
[3] Normandy Univ, F-14032 Caen, France
[4] INSA Rouen Normandy, F-76800 Saint Etienne Du Rouvray, France
Keywords
Seq2Seq model; hybrid attention; segmentation-free; paragraph handwriting recognition; fully convolutional network; encoder-decoder; optical character recognition; LINE SEGMENTATION; MARKOV-MODELS; HYBRID;
DOI
10.1109/TPAMI.2022.3144899
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unconstrained handwritten text recognition remains challenging for computer vision systems. Paragraph text recognition is traditionally achieved by two models: the first for line segmentation and the second for text line recognition. We propose a unified end-to-end model using hybrid attention to tackle this task. The model is designed to process a paragraph image iteratively, line by line. It can be split into three modules. An encoder generates feature maps from the whole paragraph image. Then, an attention module recurrently generates a vertical weighted mask that focuses on the features of the current text line, thereby performing a kind of implicit line segmentation. For the features of each text line, a decoder module recognizes the associated character sequence, leading to the recognition of the whole paragraph. We achieve state-of-the-art character error rates at paragraph level on three popular datasets: 1.91% for RIMES, 4.45% for IAM and 3.59% for READ 2016. Our code and trained model weights are available at https://github.com/FactoDeepLearning/VerticalAttentionOCR.
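As a rough illustration of the vertical attention step described in the abstract, the sketch below collapses a feature map along its vertical axis using softmax weights over rows, yielding one feature vector per column for the attended line. It is not the authors' implementation (the repository linked above holds that): the function names `softmax` and `vertical_attention`, the toy feature map, and the hand-picked row scores are all assumptions made for illustration. In the actual model the scores would come from a learned attention module rather than being supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def vertical_attention(features, row_scores):
    """Collapse an H x W x C feature map into W x C line features.

    features:   nested list indexed [row][column][channel]
    row_scores: one unnormalised score per row (hypothetical input; a
                real model would predict these with a learned module)
    """
    weights = softmax(row_scores)  # one weight per feature row, summing to 1
    height = len(features)
    width = len(features[0])
    channels = len(features[0][0])
    line = [[0.0] * channels for _ in range(width)]
    for h in range(height):
        for w in range(width):
            for c in range(channels):
                # Weighted sum over the vertical axis only.
                line[w][c] += weights[h] * features[h][w][c]
    return weights, line


# Toy 3-row, 2-column, 1-channel feature map; each row stands for one text line.
features = [
    [[1.0], [2.0]],
    [[10.0], [20.0]],
    [[100.0], [200.0]],
]

# One score vector per iteration, each peaking on a different row,
# mimicking a line-by-line pass over the paragraph.
for step, scores in enumerate([[50.0, 0.0, 0.0],
                               [0.0, 50.0, 0.0],
                               [0.0, 0.0, 50.0]]):
    weights, line = vertical_attention(features, scores)
    print(step, [round(x, 3) for x in weights], line)
```

Each iteration produces a width-by-channel sequence that a line-level decoder could consume, which is the sense in which the weighted mask acts as an implicit line segmentation.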
Pages: 508-524
Page count: 17
Related papers
50 in total
  • [1] End-to-end Handwritten Chinese Paragraph Text Recognition Using Residual Attention Networks
    Wang, Yintong
    Yang, Yingjie
    Chen, Haiyan
    Zheng, Hao
    Chang, Heyou
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2022, 34 (01): 371-388
  • [2] Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention
    Bluche, Theodore
    Louradour, Jerome
    Messina, Ronaldo
    2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), VOL 1, 2017: 1050-1055
  • [3] An end-to-end handwritten text recognition method using residual attention networks
    Wang Y.-T.
    Zheng H.
    Chang H.-Y.
    Li S.
    Kongzhi yu Juece/Control and Decision, 2023, 38 (07): 1825-1834
  • [4] End-to-end attention convolutional recurrent network for online handwritten Chinese text recognition
    Qu, Xiwen
    Wu, Zhihong
    Huang, Jun
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (23): 62541-62558
  • [5] Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition
    Bluche, Theodore
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [6] End-to-End Historical Handwritten Ethiopic Text Recognition Using Deep Learning
    Malhotra, Ruchika
    Addis, Maru Tesfaye
    IEEE ACCESS, 2023, 11: 99535-99545
  • [7] End-to-End page-Level assessment of handwritten text recognition
    Vidal, Enrique
    Toselli, Alejandro H.
    Rios-Vila, Antonio
    Calvo-Zaragoza, Jorge
    PATTERN RECOGNITION, 2023, 142
  • [8] End-to-End Chinese Image Text Recognition with Attention Model
    Sheng, Fenfen
    Zhai, Chuanlei
    Chen, Zhineng
    Xu, Bo
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636: 180-189
  • [9] Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model
    Carbonell, Manuel
    Villegas, Mauricio
    Fornes, Alicia
    Llados, Josep
    2018 13TH IAPR INTERNATIONAL WORKSHOP ON DOCUMENT ANALYSIS SYSTEMS (DAS), 2018: 399-404
  • [10] END-TO-END CHINESE TEXT RECOGNITION
    Hu, Jie
    Guo, Tszhang
    Cao, Ji
    Zhang, Changshui
    2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017: 1407-1411