AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks

Cited by: 8
Authors
Kass, Dmitrijs [1 ]
Vats, Ekta [2 ]
Affiliations
[1] Uppsala Univ, Dept Informat Technol, Uppsala, Sweden
[2] Uppsala Univ, Ctr Digital Humanities Uppsala, Dept ALM, Uppsala, Sweden
Keywords
Handwritten text recognition; Attention encoder-decoder networks; Sequence-to-sequence model; Transfer learning; Multi-writer
DOI
10.1007/978-3-031-06555-2_34
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and explores transfer learning for data-efficient training of HTR systems. To overcome the scarcity of training data, the work leverages models pre-trained on scene text images as a starting point for tailoring handwriting recognition models. The encoder comprises a ResNet feature extraction stage and a bidirectional LSTM-based sequence modeling stage; the prediction stage consists of a decoder with a content-based attention mechanism. The effectiveness of the proposed end-to-end HTR system is empirically evaluated on the novel multi-writer Imgur5K dataset and the IAM dataset. The experimental results demonstrate the performance of the HTR framework and are further supported by an in-depth analysis of the error cases. Source code and pre-trained models are available on GitHub (https://github.com/dmitrijsk/AttentionHTR).
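The abstract outlines the pipeline: a CNN feature extractor, a bidirectional LSTM encoder, and a decoder driven by content-based attention. The following is a minimal PyTorch sketch of that kind of architecture, not the authors' implementation: the small stand-in CNN (in place of a ResNet backbone), the GRU-cell decoder, the layer sizes, and the charset size are illustrative assumptions; the actual code is in the linked repository.

```python
# Minimal sketch of an attention-based encoder-decoder for handwritten word
# recognition: CNN features -> bidirectional LSTM encoder -> content-based
# attention decoder. Sizes and modules are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """CNN feature extractor followed by a bidirectional LSTM."""

    def __init__(self, hidden: int = 256):
        super().__init__()
        # Small stand-in CNN; the paper uses a ResNet backbone instead.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((1, None)),  # collapse height, keep width as time axis
        )
        self.rnn = nn.LSTM(128, hidden, bidirectional=True, batch_first=True)

    def forward(self, images):                    # images: (B, 1, H, W)
        f = self.cnn(images).squeeze(2)           # (B, C, W')
        f = f.permute(0, 2, 1)                    # (B, W', C): sequence of frames
        out, _ = self.rnn(f)                      # (B, W', 2*hidden)
        return out


class AttentionDecoder(nn.Module):
    """One decoding step with content-based (additive) attention."""

    def __init__(self, enc_dim: int, hidden: int, num_classes: int):
        super().__init__()
        self.attn = nn.Linear(enc_dim + hidden, enc_dim)
        self.score = nn.Linear(enc_dim, 1, bias=False)
        self.rnn = nn.GRUCell(enc_dim + num_classes, hidden)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, enc_out, state, prev_onehot):
        # Content-based attention: compare the decoder state with every encoder frame.
        t = enc_out.size(1)
        query = state.unsqueeze(1).expand(-1, t, -1)
        e = self.score(torch.tanh(self.attn(torch.cat([enc_out, query], dim=-1))))
        alpha = F.softmax(e, dim=1)               # (B, T, 1) attention weights
        context = (alpha * enc_out).sum(dim=1)    # (B, enc_dim) glimpse
        state = self.rnn(torch.cat([context, prev_onehot], dim=-1), state)
        return self.out(state), state


if __name__ == "__main__":
    num_classes, hidden = 80, 256                 # assumed charset size incl. <eos>/<pad>
    enc = Encoder(hidden)
    dec = AttentionDecoder(2 * hidden, hidden, num_classes)
    images = torch.randn(4, 1, 32, 128)           # batch of grayscale word images
    enc_out = enc(images)
    state = torch.zeros(4, hidden)
    prev = torch.zeros(4, num_classes)            # one-hot of <sos>, all zeros here
    for _ in range(10):                           # greedy decoding for 10 steps
        logits, state = dec(enc_out, state, prev)
        prev = F.one_hot(logits.argmax(-1), num_classes).float()
```

For transfer learning as described in the abstract, the same decoder and encoder weights would be initialized from a model trained on scene text images and then fine-tuned on handwritten word images; that step is omitted here.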
Pages: 507-522
Page count: 16