Multi-orientation scene text detection with scale-guided regression

Cited by: 6
Authors
Liang, Min [1]
Hou, Jie-Bo [1]
Zhu, Xiaobin [1]
Yang, Chun [1]
Qin, Jingyan [2]
Yin, Xu-Cheng [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Mech Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Scene text detection; Classification; Regression;
DOI
10.1016/j.neucom.2021.07.026
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing multi-orientation scene text detection methods generally contain two crucial components: regression of text bounding boxes and text/non-text classification. However, these methods typically treat classification and regression as two independent procedures and do not fully exploit the relationship between them. Based on this observation, we propose a Scale-Guided Regression Module (SRM) designed specifically for multi-orientation scene text detection. Equipped with width-guided and height-guided kernels of different sizes, the SRM generates a series of scale feature maps for candidate texts by capturing their shape information during classification prediction. The scale feature maps are used to predict the width and height of candidate texts, which then serve as guides for regressing bounding boxes. In this way, classification and regression are coherently integrated. In addition, we train the network with IoU loss and then combine IoU loss and smooth-L1 loss for fine-tuning. Extensive experiments on publicly available datasets demonstrate the state-of-the-art performance of our method. Notably, our method achieves a significant improvement on long texts: on MSRA-TD500, it outperforms the Basemodel by a large margin (4.86% in terms of Recall). (C) 2021 Elsevier B.V. All rights reserved.
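The two-stage loss schedule mentioned in the abstract (IoU loss first, then IoU loss combined with smooth-L1 loss for fine-tuning) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes axis-aligned boxes in (x1, y1, x2, y2) form, whereas the paper targets multi-oriented boxes. It also assumes the common -ln(IoU) form of the IoU loss. The function names and the weight `lam` are hypothetical.

```python
import math


def iou_loss(pred, target, eps=1e-7):
    """-ln(IoU) for two axis-aligned boxes (x1, y1, x2, y2). Assumed form."""
    # Intersection rectangle; width/height are clamped to zero if disjoint.
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)
    return -math.log(iou + eps)


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth-L1 distance over box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def fine_tune_loss(pred, target, lam=1.0):
    """Hypothetical fine-tuning objective: IoU loss plus weighted smooth-L1."""
    return iou_loss(pred, target) + lam * smooth_l1(pred, target)


if __name__ == "__main__":
    pred, gt = (0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0)
    print("stage 1 (IoU only):", iou_loss(pred, gt))
    print("stage 2 (IoU + smooth-L1):", fine_tune_loss(pred, gt))
```

The IoU term scores the box as a whole, while the smooth-L1 term penalizes each coordinate separately. Combining them in a second stage is one plausible reading of the fine-tuning step described in the abstract.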
Pages: 310-318
Page count: 9