Bidirectional Scene Text Recognition with a Single Decoder

Cited by: 6
Authors
Bleeker, Maurits [1]
de Rijke, Maarten [1]
Affiliation
[1] Univ Amsterdam, Amsterdam, Netherlands
DOI
10.3233/FAIA200404
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scene Text Recognition (STR) is the problem of recognizing the correct word or character sequence in a cropped word image. To obtain more robust output sequences, the notion of bidirectional STR has been introduced. So far, bidirectional STR methods have been implemented with two separate decoders: one for left-to-right decoding and one for right-to-left decoding. Having two separate decoders for almost the same task with the same output space is undesirable from a computational and optimization point of view. We introduce the Bidirectional Scene Text Transformer (Bi-STET), a novel bidirectional STR method with a single decoder for bidirectional text decoding. With its single decoder, Bi-STET outperforms methods that apply bidirectional decoding with two separate decoders, while also being more efficient than those methods. Furthermore, we match or outperform state-of-the-art (SOTA) methods on all STR benchmarks with Bi-STET. Finally, we provide analyses of and insights into the performance of Bi-STET.
Pages: 2664-2671
Number of pages: 8
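
The abstract does not say how one decoder serves both reading directions. The sketch below illustrates one plausible way to set this up: each word yields two training targets that share one output vocabulary, and each target is tagged with a direction marker that the decoder would be conditioned on. The marker names `<l2r>` and `<r2l>`, the `<eos>` symbol, and both helper functions are hypothetical and used only for illustration. The score-based selection step is an assumption too, not the procedure confirmed by this record.

```python
from typing import List, Tuple

# Hypothetical special symbols; the record does not specify any of these.
L2R = "<l2r>"
R2L = "<r2l>"
EOS = "<eos>"


def make_bidirectional_targets(word: str) -> List[Tuple[str, List[str]]]:
    """Build one target per reading direction for a single shared decoder.

    Both targets use the same character vocabulary; only the direction
    marker and the character order differ.
    """
    chars = list(word)
    return [
        (L2R, chars + [EOS]),        # left-to-right target
        (R2L, chars[::-1] + [EOS]),  # right-to-left target
    ]


def select_prediction(l2r: Tuple[str, float], r2l: Tuple[str, float]) -> str:
    """Pick the hypothesis with the higher log-probability (assumed rule).

    The right-to-left string is reversed back into reading order first.
    """
    l2r_text, l2r_score = l2r
    r2l_text, r2l_score = r2l
    return l2r_text if l2r_score >= r2l_score else r2l_text[::-1]


if __name__ == "__main__":
    for direction, target in make_bidirectional_targets("text"):
        print(direction, target)
    # The right-to-left hypothesis scores higher here, so it is reversed and chosen.
    print(select_prediction(("tent", -1.2), ("txet", -0.3)))
```

Because both directions share one decoder and one output space in this setup, the only per-direction state is the marker itself, which is the efficiency argument the abstract makes against keeping two separate decoders.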