PIEED: Position information enhanced encoder-decoder framework for scene text recognition

Cited by: 12
Authors
Ma, Xitao [1 ]
He, Kai [1 ]
Zhang, Dazhuang [1 ]
Li, Dashuang [1 ]
Affiliations
[1] Tianjin Univ, Tianjin 300072, Peoples R China
Keywords
Scene text recognition; Position information; Long short term memory; Sequence-to-sequence learning; LOCALIZATION;
DOI
10.1007/s10489-021-02219-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Scene text recognition (STR) has developed rapidly with the rise of deep learning. Recently, attention-based encoder-decoder frameworks have been widely used in STR to improve recognition. However, the Long Short-Term Memory (LSTM) network commonly used in these frameworks tends to ignore certain position or visual information. To address this problem, we propose a Position Information Enhanced Encoder-Decoder (PIEED) framework for scene text recognition, in which an additional position information enhancement (PIE) module compensates for the shortcomings of the LSTM network. The module tends to retain more position information in the feature sequence, alongside the context information extracted by the LSTM network, which helps improve recognition accuracy for text without context. In addition, our fusion decoder makes full use of the outputs of both the proposed module and the LSTM network, so as to independently learn and preserve useful features, which helps improve recognition accuracy without increasing the number of parameters. The overall framework can be trained end-to-end using only images and ground truth. Extensive experiments on several benchmark datasets demonstrate that the proposed framework surpasses state-of-the-art methods on both regular and irregular text recognition.
Pages: 6698 - 6707
Page count: 10
Related papers
50 in total
  • [1] PIEED: Position information enhanced encoder-decoder framework for scene text recognition
    Ma, Xitao
    He, Kai
    Zhang, Dazhuang
    Li, Dashuang
    [J]. Applied Intelligence, 2021, 51 : 6698 - 6707
  • [2] Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition
    Cui, Mengmeng
    Wang, Wei
    Zhang, Jinjin
    Wang, Liang
    [J]. DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV, 2021, 12824 : 156 - 170
  • [3] Natural Scene Text Recognition Based on Encoder-Decoder Framework
    Zuo, Ling-Qun
    Sun, Hong-Mei
    Mao, Qi-Chao
    Qi, Rong
    Jia, Rui-Sheng
    [J]. IEEE ACCESS, 2019, 7 : 62616 - 62623
  • [4] Enhanced Attention-Based Encoder-Decoder Framework for Text Recognition
    Prabu, S.
    Sundar, K. Joseph Abraham
    [J]. INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 35 (02) : 2071 - 2086
  • [5] An Algorithm Based on Text Position Correction and Encoder-Decoder Network for Text Recognition in the Scene Image of Visual Sensors
    Huang, Zhiwei
    Lin, Jinzhao
    Yang, Hongzhi
    Wang, Huiqian
    Bai, Tong
    Liu, Qinghui
    Pang, Yu
    [J]. SENSORS, 2020, 20 (10)
  • [6] SqueezedText: A Real-Time Scene Text Recognition by Binary Convolutional Encoder-Decoder Network
    Liu, Zichuan
    Li, Yixing
    Ren, Fengbo
    Goh, Wang Ling
    Yu, Hao
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018 : 7194 - 7201
  • [7] Machine translation of cortical activity to text with an encoder-decoder framework
    Makin, Joseph G.
    Moses, David A.
    Chang, Edward F.
    [J]. NATURE NEUROSCIENCE, 2020, 23 (04) : 575 - +
  • [8] AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks
    Kass, Dmitrijs
    Vats, Ekta
    [J]. DOCUMENT ANALYSIS SYSTEMS, DAS 2022, 2022, 13237 : 507 - 522
  • [9] A Detection and Verification Model Based on SSD and Encoder-Decoder Network for Scene Text Detection
    Gao, Xue
    Han, Siyi
    Luo, Cong
    [J]. IEEE ACCESS, 2019, 7 : 71299 - 71310
  • [10] Correlation Encoder-Decoder Model for Text Generation
    Zhang, Xu
    Li, Yifeng
    Peng, Xueping
    Qiao, Xinxiao
    Zhang, Hui
    Lu, Wenpeng
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022