Residual attention-based multi-scale script identification in scene text images

Authors
Ma M. [1]
Wang Q.-F. [1]
Huang S. [2]
Huang S. [2]
Goulermas Y. [3]
Huang K. [1]
Affiliations
[1] Department of Intelligent Science, School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou
[2] Tencent Technology Co. Ltd, Beijing
[3] Department of Computer Science, University of Liverpool, Liverpool
Funding
National Natural Science Foundation of China
Keywords
Attention mechanism; Feature fusion; Global max pooling; Multi-scale features; Script identification;
DOI
10.1016/j.neucom.2020.09.015
Abstract
Script identification is an essential step in the text extraction pipeline for multi-lingual applications. This paper presents an effective approach to identifying scripts in scene text images. Because of complicated backgrounds, varied text styles, and character similarity across languages, script identification remains an unsolved problem. Within the general classification framework for script identification, we investigate two important components: feature extraction and the classification layer. For feature extraction, we use a hierarchical feature fusion block to extract multi-scale features, and we adopt an attention mechanism to capture the locally discriminative parts of the feature maps. In the classification layer, we use a fully convolutional classifier to generate channel-level classifications, which are then processed by a global pooling layer to improve classification efficiency. We evaluated the proposed approach on the benchmark datasets RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, and the experimental results show the effectiveness of each elaborately designed component. Our approach outperforms competitive models, with correct rates of 89.66%, 96.11%, 98.78% and 97.20% on RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, respectively. © 2020 Elsevier B.V.
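The classification layer described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the fully convolutional classifier is a single 1x1 convolution that yields one score map per script class, and that the global pooling layer is the global max pooling named in the keywords. All function names, shapes, and weights below are hypothetical.

```python
# Illustrative sketch only: a 1x1 convolutional classifier followed by
# global max pooling. Shapes, weights, and names are assumptions, not
# taken from the paper.

def conv1x1(feature_map, weights, bias):
    """Apply a 1x1 convolution.

    feature_map: list of C channels, each an H x W list of lists.
    weights: K rows of C coefficients (one row per script class).
    bias: list of K floats.
    Returns K score maps, each H x W.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    score_maps = []
    for k, row in enumerate(weights):
        score_map = [
            [
                bias[k] + sum(row[c] * feature_map[c][y][x] for c in range(channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        score_maps.append(score_map)
    return score_maps


def global_max_pool(score_maps):
    """Reduce each H x W score map to its single largest value."""
    return [max(max(r) for r in m) for m in score_maps]


def predict_script(feature_map, weights, bias):
    """Return (predicted class index, per-class pooled scores)."""
    scores = global_max_pool(conv1x1(feature_map, weights, bias))
    best = max(range(len(scores)), key=lambda k: scores[k])
    return best, scores


# Toy input: 2 feature channels on a 2 x 3 grid, 3 hypothetical script classes.
features = [
    [[0.0, 1.0, 0.0], [0.0, 0.0, 2.0]],
    [[1.0, 0.0, 0.0], [0.0, 3.0, 0.0]],
]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
bias = [0.0, 0.0, 0.0]

label, scores = predict_script(features, weights, bias)
print(label, scores)
```

Because the pooling step keeps only the strongest response per class, the prediction is driven by the most discriminative location in the image and the classifier needs no fully connected layer tied to a fixed input size.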
Pages: 222-233
Page count: 11