Multi-orientation scene text detection with scale-guided regression

Cited by: 6
Authors
Liang, Min [1]
Hou, Jie-Bo [1]
Zhu, Xiaobin [1]
Yang, Chun [1]
Qin, Jingyan [2]
Yin, Xu-Cheng [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[2] Univ Sci & Technol Beijing, Sch Mech Engn, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Scene text detection; Classification; Regression;
DOI
10.1016/j.neucom.2021.07.026
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Existing multi-orientation scene text detection methods generally contain two crucial components: regression of text bounding boxes and text/non-text classification. However, these methods typically treat classification and regression as two independent procedures and do not fully exploit the relation between them. Based on this observation, we propose a Scale-Guided Regression Module (SRM) designed specifically for multi-orientation scene text detection. Equipped with width-guided and height-guided kernels of different sizes, the SRM generates a series of scale feature maps for candidate texts by capturing their shape information during classification prediction. The scale feature maps are used to predict the width and height of candidate texts, which then serve as guides for regressing bounding boxes. In this way, classification and regression are coherently integrated. In addition, we train the network with IoU loss and then combine IoU loss and smooth L1 loss for fine-tuning. Extensive experiments on publicly available datasets demonstrate the state-of-the-art performance of our method. Notably, our method achieves a significant improvement on long texts; e.g., on MSRA-TD500 it outperforms Basemodel by a large margin (4.86% in terms of Recall). (C) 2021 Elsevier B.V. All rights reserved.
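The two-stage loss schedule mentioned in the abstract (IoU loss for training, then IoU loss plus smooth L1 loss for fine-tuning) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes axis-aligned boxes in (x1, y1, x2, y2) form rather than rotated text boxes, takes the IoU loss as -ln(IoU), and uses hypothetical names and parameters (`box_iou`, `iou_loss`, `smooth_l1`, `fine_tune_loss`, `weight`, `beta`) that do not come from the record.

```python
import math


def box_iou(a, b):
    """IoU of two axis-aligned boxes (x1, y1, x2, y2). Axis alignment is an assumption."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target, eps=1e-7):
    """IoU loss taken as -ln(IoU); the exact form used in the paper is not stated here."""
    return -math.log(box_iou(pred, target) + eps)


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 distance over box coordinates (beta is a hypothetical threshold)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for larger errors.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def fine_tune_loss(pred, target, weight=1.0):
    """Fine-tuning objective: IoU loss plus a weighted smooth L1 term (weight is assumed)."""
    return iou_loss(pred, target) + weight * smooth_l1(pred, target)


if __name__ == "__main__":
    pred, gt = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print("IoU:", box_iou(pred, gt))
    print("training loss:", iou_loss(pred, gt))
    print("fine-tuning loss:", fine_tune_loss(pred, gt))
```

In this sketch the IoU term is scale-invariant while the smooth L1 term penalizes raw coordinate error, which is one plausible reason for combining them only during fine-tuning; the record itself does not give the motivation.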
Pages: 310-318
Number of pages: 9