Residual attention-based multi-scale script identification in scene text images

Cited by: 0

Authors:

Ma M. [1]

Wang Q.-F. [1]

Huang S. [2]

Goulermas Y. [3]

Huang K. [1]

Affiliations:

[1] Department of Intelligent Science, School of Advanced Technology, Xi'an Jiaotong-Liverpool University, Suzhou

[2] Tencent Technology Co. Ltd, Beijing

[3] Department of Computer Science, University of Liverpool, Liverpool

Source:

Neurocomputing | 2021 / Volume 421

Funding:

National Natural Science Foundation of China;

Keywords:

Attention mechanism; Feature fusion; Global max pooling; Multi-scale features; Script identification;

DOI:

10.1016/j.neucom.2020.09.015

Abstract:

Script identification is an essential step in the text extraction pipeline for multilingual applications. This paper presents an effective approach to identifying scripts in scene text images. Because of complicated backgrounds, varied text styles, and the similarity of characters across languages, script identification remains an unsolved problem. Within the general classification framework for script identification, we investigate two important components: feature extraction and the classification layer. For feature extraction, we use a hierarchical feature fusion block to extract multi-scale features, and we adopt an attention mechanism to obtain the locally discriminative parts of the feature maps. In the classification layer, we use a fully convolutional classifier to generate channel-level classifications, which are then processed by a global pooling layer to improve classification efficiency. We evaluated the proposed approach on the benchmark datasets RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, and the experimental results show the effectiveness of each elaborately designed component. Finally, we achieve better performance than competitive models, with correct rates of 89.66%, 96.11%, 98.78% and 97.20% on RRC-MLT2017, SIW-13, CVSI-2015 and MLe2e, respectively. © 2020 Elsevier B.V.
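The classification layer described in the abstract (a fully convolutional classifier followed by global pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single 1x1 convolution that maps feature channels to one score map per script class, and it assumes max pooling (suggested by the "Global max pooling" keyword). The function names, toy feature values, weights, and class labels are all hypothetical.

```python
# Illustrative sketch only: shapes, weights, and labels are assumptions,
# not values or layers taken from the paper.

def conv1x1(feature_map, weights, biases):
    """Apply a 1x1 convolution.

    feature_map is indexed [channel][row][col]; weights is [class][channel].
    Returns one score map per class, indexed [class][row][col].
    """
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [
                sum(weights[k][c] * feature_map[c][y][x] for c in range(channels))
                + biases[k]
                for x in range(width)
            ]
            for y in range(height)
        ]
        for k in range(len(weights))
    ]


def global_max_pool(score_maps):
    """Reduce each class score map to its single largest response."""
    return [max(max(row) for row in score_map) for score_map in score_maps]


def classify(feature_map, weights, biases, labels):
    """Return the label with the highest pooled score, plus all pooled scores."""
    scores = global_max_pool(conv1x1(feature_map, weights, biases))
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores


# Toy input: 2 feature channels on a 2x3 grid, 3 hypothetical script classes.
features = [
    [[0.1, 0.9, 0.2], [0.0, 0.3, 0.1]],
    [[0.5, 0.1, 0.0], [0.2, 0.8, 0.4]],
]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
biases = [0.0, 0.0, 0.0]
labels = ["Latin", "Arabic", "Chinese"]

label, scores = classify(features, weights, biases, labels)
print(label, scores)
```

Because the pooling step keeps only the strongest spatial response per class, the classifier needs no fixed input width, which is convenient for text-line images of varying length. With the toy values above, the first class map has the largest peak, so the sketch selects the first label.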

Pages: 222-233

Page count: 11
