Residual attention-based multi-scale script identification in scene text images

Authors
Ma M. [1]
Wang Q.-F. [1]
Huang S. [2]
Huang S. [2]
Goulermas Y. [3]
Huang K. [1]
Affiliations
[1] Department of Intelligent Science, School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou
[2] Tencent Technology Co. Ltd, Beijing
[3] Department of Computer Science, University of Liverpool, Liverpool
Funding
National Natural Science Foundation of China
Keywords
Attention mechanism; Feature fusion; Global max pooling; Multi-scale features; Script identification;
DOI
10.1016/j.neucom.2020.09.015
Abstract
Script identification is an essential step in the text extraction pipeline for multi-lingual applications. This paper presents an effective approach to identifying scripts in scene text images. Because of complicated backgrounds, varied text styles, and the similarity of characters across languages, script identification remains an unsolved problem. Under the general classification framework of script identification, we investigate two important components: feature extraction and the classification layer. For feature extraction, we use a hierarchical feature fusion block to extract multi-scale features. Furthermore, we adopt an attention mechanism to obtain the locally discriminative parts of the feature maps. In the classification layer, we use a fully convolutional classifier to generate channel-level classifications, which are then processed by a global pooling layer to improve classification efficiency. We evaluated the proposed approach on the benchmark datasets RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, and the experimental results show the effectiveness of each elaborately designed component. Finally, we achieve better performance than competitive models, with correct rates of 89.66%, 96.11%, 98.78% and 97.20% on RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, respectively. © 2020 Elsevier B.V.
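The classification layer described in the abstract (a fully convolutional classifier that produces one score map per class, followed by global pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: the 1x1 kernel size, the function names, and the toy feature map are assumptions made for illustration, and max pooling is chosen only because "Global max pooling" appears in the keywords.

```python
def conv1x1_classifier(feature_map, weights, bias):
    """Map C feature channels to K class score maps with a 1x1 convolution.

    feature_map: list of C channels, each an H x W list of lists.
    weights: K x C list of lists; bias: list of K floats.
    (Hypothetical helper; the 1x1 kernel is an assumption.)
    """
    num_channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    score_maps = []
    for k in range(len(weights)):
        score_maps.append([
            [bias[k] + sum(weights[k][c] * feature_map[c][i][j]
                           for c in range(num_channels))
             for j in range(width)]
            for i in range(height)
        ])
    return score_maps


def global_max_pool(score_maps):
    """Reduce each class score map to its single largest value."""
    return [max(max(row) for row in score_map) for score_map in score_maps]


def predict(feature_map, weights, bias):
    """Return (predicted class index, per-class logits)."""
    logits = global_max_pool(conv1x1_classifier(feature_map, weights, bias))
    label = max(range(len(logits)), key=logits.__getitem__)
    return label, logits


# Toy input: 2 feature channels on a 2 x 3 grid, 2 script classes.
toy_features = [
    [[0, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 3]],
]
toy_weights = [[1, 0], [0, 1]]  # each class reads exactly one channel
toy_bias = [0.0, 0.0]

label, logits = predict(toy_features, toy_weights, toy_bias)
print(label, logits)  # prints: 1 [1.0, 3.0]
```

Because the pooling step collapses the spatial dimensions, the same classifier returns one logit per class regardless of the width of the input feature map, which is convenient for text images of varying length.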
Pages: 222 - 233
Page count: 11
Related papers
50 in total
  • [21] AAANE: Attention-Based Adversarial Autoencoder for Multi-scale Network Embedding
    Sang, Lei
    Xu, Min
    Qian, Shengsheng
    Wu, Xindong
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT III, 2019, 11441 : 3 - 14
  • [22] MADC: Multi-scale Attention-based Deep Clustering for Workload Prediction
    Huang, Jiaming
    Xiao, Chuming
    Wu, Weigang
    Yin, Ye
    Chang, Hongli
    19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021, : 316 - 323
  • [23] SCENE TEXT DETECTION BASED ON MULTI-SCALE SWT AND EDGE FILTERING
    Feng, Yuanyuan
    Song, Yonghong
    Zhang, Yualin
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 645 - 650
  • [24] Multi-Scale Scene Text Detection Based on Convolutional Neural Network
    Lu, Yan-Feng
    Zhang, Ai-Xuan
    Li, Yi
    Yu, Qian-Hui
    Qiao, Hong
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 583 - 587
  • [25] Attention-Based Multi-Scale Convolutional Neural Network (A+MCNN) for Multi-Class Classification in Road Images
    Eslami, Elham
    Yun, Hae-Bum
    SENSORS, 2021, 21 (15)
  • [26] Attention-Based CNN-RNN Arabic Text Recognition from Natural Scene Images
    Butt, Hanan
    Raza, Muhammad Raheel
    Ramzan, Muhammad Javed
    Ali, Muhammad Junaid
    Haris, Muhammad
    FORECASTING, 2021, 3 (03): 520 - 540
  • [27] Multi-scale wavelet texture-based script identification method
    Zeng, Li
    Tang, Yuanyan
    Chen, Tinghuai
    Jisuanji Xuebao/Chinese Journal of Computers, 2000, 23 (07): 699 - 704
  • [28] A Multi-Scale Natural Scene Text Detection Method Based on Attention Feature Extraction and Cascade Feature Fusion
    Li, Nianfeng
    Wang, Zhenyan
    Huang, Yongyuan
    Tian, Jia
    Li, Xinyuan
    Xiao, Zhiguo
    SENSORS, 2024, 24 (12)
  • [29] Attention-Based Scene Text Detection on Dual Feature Fusion
    Li, Yuze
    Silamu, Wushour
    Wang, Zhenchao
    Xu, Miaomiao
    SENSORS, 2022, 22 (23)
  • [30] Adaptive embedding gate for attention-based scene text recognition
    Chen, Xiaoxue
    Wang, Tianwei
    Zhu, Yuanzhi
    Jin, Lianwen
    Luo, Canjie
    NEUROCOMPUTING, 2020, 381 : 261 - 271