Attention-Based Scene Text Detection on Dual Feature Fusion

Cited by: 3
Authors
Li, Yuze [1]
Silamu, Wushour [1]
Wang, Zhenchao [1]
Xu, Miaomiao [1]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Xinjiang Multilingual Informat Technol Res Ctr, Xinjiang Multilingual Informat Technol Lab, Urumqi 830017, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
scene text detection; feature pyramid network; spatial attention; multi-scale feature fusion; differentiable binarization;
DOI
10.3390/s22239072
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Segmentation-based scene text detection algorithms perform well on text of arbitrary shape and extreme aspect ratio, thanks to their pixel-level description and fine-grained post-processing. However, insufficient use of semantic and spatial information limits the network's classification and localization capabilities: existing scene text detection methods lose important feature information while extracting features from each network layer. To address this problem, the Attention-based Dual Feature Fusion Model (ADFM) is proposed. Its Bi-directional Feature Fusion Pyramid Module (BFM) first adds stronger semantic information to the higher-resolution feature maps through a top-down process, then reduces the aliasing effects generated by that process through a bottom-up process, enhancing the representation of multi-scale text semantic information. Meanwhile, a position-sensitive Spatial Attention Module (SAM) is introduced between the two stages of feature fusion. It focuses on the single feature map with the highest resolution and strongest semantic features generated in the top-down process and weights each spatial position by its relevance to text features, thus improving the sensitivity of the detection network to text regions. The effectiveness of each ADFM module was verified by ablation experiments, and the model was compared with recent scene text detection methods on several publicly available datasets.
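The data flow described in the abstract (a top-down pass, spatial attention on the highest-resolution map, then a bottom-up pass) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes element-wise addition for fusion, nearest-neighbour upsampling, 2x2 average pooling for downsampling, and a parameter-free attention score (channel mean plus channel max passed through a sigmoid) in place of any learned convolution. All function names are hypothetical.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(x):
    """2x2 average pooling of a (C, H, W) map; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def spatial_attention(x):
    """Assumed stand-in for SAM: one weight in (0, 1) per spatial position.

    The score is the channel-wise mean plus the channel-wise max; a learned
    module would normally replace this fixed scoring rule.
    """
    score = x.mean(axis=0) + x.max(axis=0)
    weights = 1.0 / (1.0 + np.exp(-score))
    return x * weights[None, :, :], weights

def dual_fusion(features):
    """features: list of (C, H, W) maps ordered fine to coarse, each half the
    resolution of the previous one and with the same channel count."""
    n = len(features)

    # Stage 1 (top-down): push coarse, semantically strong maps into finer ones.
    top_down = [None] * n
    top_down[-1] = features[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = features[i] + upsample2x(top_down[i + 1])

    # Between the stages: reweight the highest-resolution map by position.
    top_down[0], weights = spatial_attention(top_down[0])

    # Stage 2 (bottom-up): propagate the attended fine map back to coarser ones.
    fused = [top_down[0]]
    for i in range(1, n):
        fused.append(top_down[i] + downsample2x(fused[i - 1]))
    return fused, weights

# Toy input: four pyramid levels with 8 channels, from 64x64 down to 8x8.
rng = np.random.default_rng(0)
feats = [rng.standard_normal((8, 64 // 2**i, 64 // 2**i)) for i in range(4)]
fused, weights = dual_fusion(feats)
```

Each fused map keeps the shape of its input level, and `weights` holds one attention value per pixel of the finest level. In a trained detector the fusion and attention steps would carry learned parameters; the sketch only shows the order of operations.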
Pages: 14