Multi-orientation scene text detection with scale-guided regression

Cited by: 6
Authors
Liang, Min [1]
Hou, Jie-Bo [1]
Zhu, Xiaobin [1]
Yang, Chun [1]
Qin, Jingyan [2]
Yin, Xu-Cheng [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Mech Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Scene text detection; Classification; Regression
DOI
10.1016/j.neucom.2021.07.026
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing multi-orientation scene text detection methods generally contain two crucial components: regression prediction for text bounding boxes and classification prediction for text/non-text. However, these methods generally treat classification prediction and regression prediction as two independent procedures and do not fully exploit the relationship between them. Based on this observation, we propose a Scale-Guided Regression Module (SRM) designed specifically for multi-orientation scene text detection. Equipped with width-guided and height-guided kernels of different sizes, the SRM generates a series of scale feature maps for candidate texts by capturing their shape information during classification prediction. The scale feature maps are used to predict the width and height of candidate texts, which then serve as guides for regressing bounding boxes. In this way, classification and regression are coherently integrated. In addition, we train the network with IoU loss and then combine IoU loss and smooth-L1 loss for fine-tuning. Extensive experiments on publicly available datasets demonstrate the state-of-the-art performance of our method. Notably, our method achieves a significant improvement on long texts: on MSRA-TD500, for example, it outperforms Basemodel by a large margin (4.86% in terms of Recall). © 2021 Elsevier B.V. All rights reserved.
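The abstract mentions training with IoU loss and fine-tuning with IoU loss plus smooth-L1 loss, but gives no formulas. The following is only a minimal sketch of how two such terms might be combined, under several assumptions not stated in this record: boxes are axis-aligned (x1, y1, x2, y2) rather than rotated, the IoU loss takes the common -ln(IoU) form, and the function names and the `weight` parameter are hypothetical.

```python
import math


def iou_loss(pred, target, eps=1e-7):
    """-ln(IoU) for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumption: the record does not specify the IoU loss form or the box
    parameterization; rotated boxes would need a polygon intersection.
    """
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    union = area_pred + area_target - inter
    return -math.log((inter + eps) / (union + eps))


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth-L1 distance over the box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def combined_loss(pred, target, weight=1.0):
    """IoU loss plus a weighted smooth-L1 term (weight is hypothetical)."""
    return iou_loss(pred, target) + weight * smooth_l1(pred, target)
```

In this sketch, calling `iou_loss` alone would correspond to the initial training stage and `combined_loss` to the fine-tuning stage; how the two terms are actually weighted in the paper is not given here.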
Pages: 310-318
Page count: 9