A Multi-head Self-relation Network for Scene Text Recognition

Citations: 0

Authors:

Zhou, Junwei [1,2]

Gao, Hongchao [1]

Dai, Jiao [1]

Liu, Dongqin [1]

Han, Jizhong [1]

Affiliations:

[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

Source:

2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2021


DOI:

10.1109/ICPR48806.2021.9413339

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Text embedded in scene images is ubiquitous in daily life. However, recognizing text in natural scene images remains challenging because of its diverse shapes and distorted patterns. Recent advanced recognition networks generally treat scene text recognition as a sequence prediction task. Although they achieve excellent performance, these networks treat feature map cells as independent individuals and update each cell's state without using information from its related cells. In addition, the local receptive field of a traditional convolutional neural network (CNN) means that a single cell cannot cover the whole text region in an image. Because of these issues, existing recognition networks cannot extract the global context information of a visual scene. To address these problems, we propose a Multi-head Self-relation Network (MSRN) for scene text recognition. The MSRN consists of several multi-head self-relation layers, which are designed to extract the global context information of a visual scene; each multi-head self-relation layer then fuses the information of related cells. Experiments demonstrate that the proposed recognition network achieves superior performance on several public benchmark datasets, including IC03, IC13, IC15, and SVT-Perspective.
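
The abstract does not specify how a multi-head self-relation layer is computed. As a rough illustration only, the sketch below assumes it behaves like standard multi-head scaled dot-product self-attention applied to the flattened cells of a CNN feature map, so that every cell can draw on every other cell. The function name, the random projection matrices (standing in for learned weights), the residual connection, and the feature-map size are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_relation(feature_map, num_heads, rng):
    """Hypothetical self-relation layer: attention over feature-map cells.

    feature_map: array of shape (H, W, C).
    Returns the fused map (H, W, C) and the relation weights
    of shape (num_heads, H*W, H*W).
    """
    h, w, c = feature_map.shape
    assert c % num_heads == 0, "channels must divide evenly across heads"
    d = c // num_heads

    # Flatten the spatial grid: each row is one feature-map cell.
    cells = feature_map.reshape(h * w, c)

    # Assumption: random projections stand in for learned parameters.
    w_q, w_k, w_v, w_o = (
        rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(4)
    )

    def split_heads(x):
        # (N, C) -> (num_heads, N, d)
        return x.reshape(-1, num_heads, d).transpose(1, 0, 2)

    q = split_heads(cells @ w_q)
    k = split_heads(cells @ w_k)
    v = split_heads(cells @ w_v)

    # Relation of every cell to every other cell, computed per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = softmax(scores, axis=-1)

    # Fuse the information of related cells, then merge the heads.
    heads = weights @ v
    merged = heads.transpose(1, 0, 2).reshape(h * w, c)

    # Assumption: a residual connection keeps the original cell features.
    fused = cells + merged @ w_o
    return fused.reshape(h, w, c), weights


rng = np.random.default_rng(0)
fmap = rng.standard_normal((4, 16, 32))  # toy 4x16 feature map, 32 channels
fused, weights = multi_head_self_relation(fmap, num_heads=4, rng=rng)
print(fused.shape, weights.shape)  # (4, 16, 32) (4, 64, 64)
```

Because each row of the relation weights spans all 64 cells, a single output cell can depend on the whole feature map, which is the kind of global context that the abstract says a local convolutional receptive field cannot provide.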


Pages: 3969-3976

Number of pages: 8

