A Multi-head Self-relation Network for Scene Text Recognition

Cited by: 0
Authors
Zhou, Junwei [1, 2]
Gao, Hongchao [1]
Dai, Jiao [1]
Liu, Dongqin [1]
Han, Jizhong [1]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
DOI
10.1109/ICPR48806.2021.9413339
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text embedded in scene images is everywhere in daily life. However, recognizing text in natural scene images remains challenging because of its diverse shapes and distorted patterns. Recent advanced recognition networks generally treat scene text recognition as a sequence prediction task. Although they achieve excellent performance, these networks treat feature map cells as independent individuals and update each cell's state without using information from related cells. In addition, the local receptive field of a traditional convolutional neural network (CNN) means that a single cell cannot cover the whole text region in an image. Because of these issues, existing recognition networks cannot extract the global context information of a visual scene. To address these problems, we propose a Multi-head Self-relation Network (MSRN) for scene text recognition. The MSRN consists of several multi-head self-relation layers, which are designed to extract the global context information of a visual scene, so that the information of related cells can be fused by each multi-head self-relation layer. Experiments demonstrate that the proposed recognition network achieves superior performance on several public benchmark datasets, including IC03, IC13, IC15, and SVT-Perspective.
Pages: 3969-3976
Number of pages: 8