AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks

Cited by: 8
Authors
Kass, Dmitrijs [1 ]
Vats, Ekta [2 ]
Affiliations
[1] Uppsala Univ, Dept Informat Technol, Uppsala, Sweden
[2] Uppsala Univ, Ctr Digital Humanities Uppsala, Dept ALM, Uppsala, Sweden
Keywords
Handwritten text recognition; Attention encoder-decoder networks; Sequence-to-sequence model; Transfer learning; Multi-writer
DOI
10.1007/978-3-031-06555-2_34
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and explores transfer learning for data-efficient training of HTR systems. To overcome the scarcity of training data, the work leverages models pre-trained on scene text images as a starting point for tailoring handwriting recognition models. The encoder comprises a ResNet feature extraction stage and a bidirectional LSTM-based sequence modeling stage; the prediction stage consists of a decoder with a content-based attention mechanism. The effectiveness of the proposed end-to-end HTR system is empirically evaluated on the novel multi-writer Imgur5K dataset and the IAM dataset. The experimental results demonstrate the performance of the HTR framework and are further supported by an in-depth analysis of the error cases. Source code and pre-trained models are available on GitHub (https://github.com/dmitrijsk/AttentionHTR).
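The abstract outlines the pipeline: a CNN feature extractor, a bidirectional LSTM encoder, and a decoder driven by content-based attention. The following is a minimal PyTorch sketch of that kind of architecture, not the authors' implementation: the small stand-in CNN (in place of a ResNet backbone), the GRU-cell decoder, the layer sizes, and the charset size are illustrative assumptions; the actual code is in the linked repository.

```python
# Minimal sketch of an attention-based encoder-decoder for handwritten word
# recognition: CNN features -> bidirectional LSTM encoder -> content-based
# attention decoder. Sizes and modules are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """CNN feature extractor followed by a bidirectional LSTM."""

    def __init__(self, hidden: int = 256):
        super().__init__()
        # Small stand-in CNN; the paper uses a ResNet backbone instead.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((1, None)),  # collapse height, keep width as time axis
        )
        self.rnn = nn.LSTM(128, hidden, bidirectional=True, batch_first=True)

    def forward(self, images):                    # images: (B, 1, H, W)
        f = self.cnn(images).squeeze(2)           # (B, C, W')
        f = f.permute(0, 2, 1)                    # (B, W', C): sequence of frames
        out, _ = self.rnn(f)                      # (B, W', 2*hidden)
        return out


class AttentionDecoder(nn.Module):
    """One decoding step with content-based (additive) attention."""

    def __init__(self, enc_dim: int, hidden: int, num_classes: int):
        super().__init__()
        self.attn = nn.Linear(enc_dim + hidden, enc_dim)
        self.score = nn.Linear(enc_dim, 1, bias=False)
        self.rnn = nn.GRUCell(enc_dim + num_classes, hidden)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, enc_out, state, prev_onehot):
        # Content-based attention: compare the decoder state with every encoder frame.
        t = enc_out.size(1)
        query = state.unsqueeze(1).expand(-1, t, -1)
        e = self.score(torch.tanh(self.attn(torch.cat([enc_out, query], dim=-1))))
        alpha = F.softmax(e, dim=1)               # (B, T, 1) attention weights
        context = (alpha * enc_out).sum(dim=1)    # (B, enc_dim) glimpse
        state = self.rnn(torch.cat([context, prev_onehot], dim=-1), state)
        return self.out(state), state


if __name__ == "__main__":
    num_classes, hidden = 80, 256                 # assumed charset size incl. <eos>/<pad>
    enc = Encoder(hidden)
    dec = AttentionDecoder(2 * hidden, hidden, num_classes)
    images = torch.randn(4, 1, 32, 128)           # batch of grayscale word images
    enc_out = enc(images)
    state = torch.zeros(4, hidden)
    prev = torch.zeros(4, num_classes)            # one-hot of <sos>, all zeros here
    for _ in range(10):                           # greedy decoding for 10 steps
        logits, state = dec(enc_out, state, prev)
        prev = F.one_hot(logits.argmax(-1), num_classes).float()
```

For transfer learning as described in the abstract, the same decoder and encoder weights would be initialized from a model trained on scene text images and then fine-tuned on handwritten word images; that step is omitted here.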
Pages: 507-522
Page count: 16