Text proposals with location-awareness-attention network for arbitrarily shaped scene text detection and recognition

Cited by: 12

Authors:

Zhong, Dajian ^{[1]}

Lyu, Shujing ^{[1,2]}

Shivakumara, Palaiahnakote ^{[3]}

Pal, Umapada ^{[4]}

Lu, Yue ^{[1,2]}

Affiliations:

[1] East China Normal Univ, Shanghai Key Lab Multidimens Informat Proc, Shanghai 200241, Peoples R China

[2] East China Normal Univ, Sch Commun & Elect Engn, Shanghai 200241, Peoples R China

[3] Univ Malaya, Fac Comp Sci & Informat Technol FSKTM, Kuala Lumpur 50603, Malaysia

[4] Indian Stat Inst, CVPR Unit, Kolkata 700108, India

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2022 / Vol. 205

Funding:

National Natural Science Foundation of China;

Keywords:

Scene text detection; Scene text recognition; Text proposal; Attention model; Location-awareness-attention model; NEURAL-NETWORK; IMAGE;

DOI:

10.1016/j.eswa.2022.117564

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Unlike existing models that address scene text detection and recognition separately, the proposed work addresses both tasks with a single architecture that handles arbitrarily oriented/shaped text. To this end, a novel Text Proposal with Location-Awareness-Attention Network (TPLAANet) for arbitrarily oriented/shaped text detection and recognition is proposed. For text detection, the proposed method uses central mask prediction to locate text instances, a bounding box regression branch to obtain tight bounding boxes, and a mask branch to obtain accurate positions of arbitrarily oriented/shaped text instances. For text recognition, the proposed method extracts character information using a Location-Awareness-Attention Network (LAAN), which learns a two-dimensional attention weight and improves recognition performance. To test the efficacy of the proposed model, we experiment on the commonly used horizontal and multi-oriented natural scene text datasets ICDAR2013 and ICDAR2015, and on the arbitrarily shaped scene text datasets Total-Text and CTW1500. Experimental results are provided to validate the effectiveness of the proposed method. The code is available at: https://codeocean.com/capsule/5666319/tree/v1.

Pages: 15