Attention-based Text Recognition in the Wild

Citations: 0

Authors:

Yan, Zhi-Chen [1]

Yu, Stephanie A. [2]

Affiliations:

[1] Facebook Research, 1 Hacker Way, Menlo Park, CA 94025, USA

[2] West Island School, 250 Victoria Road, Pokfulam, Hong Kong, People's Republic of China

Source:

PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON DEEP LEARNING THEORY AND APPLICATIONS (DELTA) | 2020

Keywords:

Attention; Convolution; Deep Learning; LSTM; Text Recognition;

DOI:

10.5220/0009970200420049

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recognizing texts in real-world scenes is an important research topic in computer vision. Many deep learning based techniques have been proposed. Such techniques typically follow an encoder-decoder architecture, and use a sequence of feature vectors as the intermediate representation. In this approach, useful 2D spatial information in the input image may be lost due to vector-based encoding. In this paper, we formulate scene text recognition as a spatiotemporal sequence translation problem, and introduce a novel attention based spatiotemporal decoding framework. We first encode an image as a spatiotemporal sequence, which is then translated into a sequence of output characters using the aforementioned decoder. Our encoding and decoding stages are integrated to form an end-to-end trainable deep network. Experimental results on multiple benchmarks, including IIIT5k, SVT, ICDAR and RCTW-17, indicate that our method can significantly outperform conventional attention frameworks.

Pages: 42-49

Page count: 8

50 items in total

[1] Adaptive embedding gate for attention-based scene text recognition
Chen, Xiaoxue
Wang, Tianwei
Zhu, Yuanzhi
Jin, Lianwen
Luo, Canjie
NEUROCOMPUTING, 2020, 381 : 261 - 271
[2] Dynamic Receptive Field Adaptation for Attention-Based Text Recognition
Qin, Haibo
Yang, Chun
Zhu, Xiaobin
Yin, Xucheng
DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT II, 2021, 12822 : 225 - 239
[3] Attention-Based Deep Learning Model for Arabic Handwritten Text Recognition
Gader T.B.A.
Echi A.K.
Machine Graphics and Vision, 2022, 31 (1-4) : 49 - 73
[4] An Attention-Based Convolutional Recurrent Neural Networks for Scene Text Recognition
Alshawi, Adil Abdullah Abdulhussein
Tanha, Jafar
Balafar, Mohammad Ali
IEEE ACCESS, 2024, 12 : 8123 - 8134
[5] STAN: A sequential transformation attention-based network for scene text recognition
Lin, Qingxiang
Luo, Canjie
Jin, Lianwen
Lai, Songxuan
PATTERN RECOGNITION, 2021, 111
[6] Enhanced Attention-Based Encoder-Decoder Framework for Text Recognition
Prabu, S.
Sundar, K. Joseph Abraham
INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 35 (02) : 2071 - 2086
[7] Attention-Based Neural Text Segmentation
Badjatiya, Pinkesh
Kurisinkel, Litton J.
Gupta, Manish
Varma, Vasudeva
ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772 : 180 - 193
[8] Attention-Based Deep Neural Network and Its Application to Scene Text Recognition
He, Haizhen
Li, Jiehan
2019 IEEE 11TH INTERNATIONAL CONFERENCE ON COMMUNICATION SOFTWARE AND NETWORKS (ICCSN 2019), 2019 : 672 - 677
[9] Recognition of Japanese historical text lines by an attention-based encoder-decoder and text line generation
Le, Anh Duc
Mochihashi, Daichi
Masuda, Katsuya
Mima, Hideki
Ly, Nam Tuan
PROCEEDINGS OF THE 2019 WORKSHOP ON HISTORICAL DOCUMENT IMAGING AND PROCESSING (HIP '19), 2019 : 37 - 41
[10] Attention-Based Models for Speech Recognition
Chorowski, Jan
Bahdanau, Dzmitry
Serdyuk, Dmitriy
Cho, Kyunghyun
Bengio, Yoshua
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
