Scene Text Detection Based on Multi-Headed Self-Attention Using Shifted Windows

Cited by: 3
Authors
Huang, Baohua [1]
Feng, Xiaoru [1]
Affiliations
[1] Guangxi Univ, Sch Comp & Elect Informat, Nanning 530004, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 06
Funding
National Natural Science Foundation of China
Keywords
scene text detection; multi-headed self-attention; shifted window; multi-oriented; multi-language;
DOI
10.3390/app13063928
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Scene text detection has become a popular topic in computer vision research. Most current work is based on deep learning and uses Convolutional Neural Networks (CNNs) to extract visual features from images. However, because convolution kernels are small, CNNs extract only local features with small receptive fields and cannot capture more global features. To improve the accuracy of scene text detection, this paper adds a feature enhancement module to the text detection model. The module acquires global features of an image by computing multi-headed self-attention over the feature map. The improved model extracts local features with CNNs and global features with the feature enhancement module, then fuses the two so that visual features at different levels of the image are captured. Self-attention is computed within shifted windows, which reduces the computational complexity from quadratic to linear in the width-height product of the input image. Experiments are conducted on the multi-oriented text dataset ICDAR2015 and the multi-language text dataset MSRA-TD500. Compared with DBNet, the baseline method before the improvement, the F1-score improves by 0.5% and 3.5% on ICDAR2015 and MSRA-TD500, respectively, indicating the effectiveness of the improvement.
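The quadratic-versus-linear complexity claim in the abstract can be illustrated with a rough cost model. The sketch below uses the estimates commonly quoted for global and window-based multi-head self-attention in the Swin Transformer literature; the abstract itself gives no formulas, so the expressions, the function names, the window size of 7 and the channel count of 96 are all assumptions chosen for illustration.

```python
def global_msa_flops(h: int, w: int, c: int) -> int:
    """Approximate cost of self-attention over all h*w tokens at once.

    The 2 * n^2 * c term grows with the square of the token count.
    """
    n = h * w
    return 4 * n * c**2 + 2 * n**2 * c


def window_msa_flops(h: int, w: int, c: int, m: int) -> int:
    """Approximate cost when attention is restricted to m x m windows.

    Each token attends to at most m*m others, so the cost is linear in n.
    """
    n = h * w
    return 4 * n * c**2 + 2 * m**2 * n * c


if __name__ == "__main__":
    channels, window = 96, 7  # illustrative assumptions, not values from the paper
    for side in (56, 112, 224):
        g = global_msa_flops(side, side, channels)
        l = window_msa_flops(side, side, channels, window)
        print(f"{side}x{side} feature map: global={g:.3e}, windowed={l:.3e}")
```

Under this model, doubling the feature-map side length quadruples the windowed cost, while the attention term of the global cost grows sixteenfold.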
Pages: 14
Related papers
  • [1] Dynamic multi-headed self-attention and multiscale enhancement vision transformer for object detection
    Fang, Sikai
    Lu, Xiaofeng
    Huang, Yifan
    Sun, Guangling
    Liu, Xuefeng
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (25) : 67213 - 67229
  • [2] Semantic Segmentation Algorithm Based Multi-headed Self-attention for Tea Picking Points
    Song Y.
    Yang S.
    Zheng Z.
    Ning J.
    [J]. Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2023, 54 (09) : 297 - 305
  • [3] Multi-modal knowledge graphs representation learning via multi-headed self-attention
    Wang, Enqiang
    Yu, Qing
    Chen, Yelin
    Slamu, Wushouer
    Luo, Xukang
    [J]. INFORMATION FUSION, 2022, 88 : 78 - 85
  • [4] Prediction of Large-Scale Regional Evapotranspiration Based on Multi-Scale Feature Extraction and Multi-Headed Self-Attention
    Zheng, Xin
    Zhang, Sha
    Zhang, Jiahua
    Yang, Shanshan
    Huang, Jiaojiao
    Meng, Xianye
    Bai, Yun
    [J]. REMOTE SENSING, 2024, 16 (07)
  • [5] Self-attention based Text Knowledge Mining for Text Detection
    Wan, Qi
    Ji, Haoqin
    Shen, Linlin
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5979 - 5988
  • [6] An Intelligent Athlete Signal Processing Methodology for Balance Control Ability Assessment with Multi-Headed Self-Attention Mechanism
    Xu, Nannan
    Cui, Xinze
    Wang, Xin
    Zhang, Wei
    Zhao, Tianyu
    [J]. MATHEMATICS, 2022, 10 (15)
  • [7] Robust Visual Tracking Using Hierarchical Vision Transformer with Shifted Windows Multi-Head Self-Attention
    Gao, Peng
    Zhang, Xin-Yue
    Yang, Xiao-Li
    Ni, Jian-Cheng
    Wang, Fei
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (01) : 161 - 164
  • [8] A new Knowledge Inference Approach Based on Multi-headed Attention Mechanism
    Cai, Yichao
    Yang, Qingyu
    Chen, Wei
    Wang, Ge
    Liu, Taian
    Liu, Xinying
    [J]. 2022 IEEE 6TH ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2022, : 825 - 829
  • [9] Using of Attention for Scene Text Detection
    Wang Y.
    Gu X.
    [J]. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2021, 33 (12) : 1908 - 1915
  • [10] MSAU-NET: ROAD EXTRACTION BASED ON MULTI-HEADED SELF-ATTENTION MECHANISM AND U-NET WITH HIGH RESOLUTION REMOTE SENSING IMAGES
    Yu, Hang
    Guo, Yuru
    Liu, Zhiheng
    Zhou, Suiping
    Li, Chenyang
    Zhang, Wenjie
    Qi, Wenjuan
    [J]. IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 6898 - 6900