ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene Text Detection

Cited by: 5
Authors
Qu, Yadong [1]
Xie, Hongtao [1]
Fang, Shancheng [1]
Wang, Yuxin [1]
Zhang, Yongdong [1]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230026, Peoples R China
Keywords
Kernel; Shape; Costs; Convolution; Adaptive systems; Text recognition; Synthetic aperture sonar; Scene text detection; shrunk polygon; aspect ratio; adaptive dilation factor
DOI
10.1109/TMM.2022.3216729
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent scene text detection methods widely use the shrunk polygon to localize text regions and separate adjacent instances. However, two problems remain: 1) existing methods ignore sensitivity to aspect ratio when reconstructing the text instance from the shrunk polygon; 2) texts with extreme aspect ratios cause the shrunk polygons to fracture. To address these problems, we propose a novel Adaptive Dilation Network (ADNet) that focuses on the reconstruction process from the shrunk polygon and aims to provide a tight and complete text representation. First, instead of using a fixed dilation factor, ADNet uses an aspect ratio-wise dilation factor to reconstruct the text region from the shrunk polygon for each text instance. This instance-wise dilation factor accounts for the scale correlation between the original and the shrunk polygon, and can therefore guide adaptive text region reconstruction for texts with large aspect ratio variance. Second, to deal with fractured detection results, a new Efficient Spatial Relationship Module (ESRM) is devised to capture long-range dependencies at low computational cost. ESRM uses a novel Weighted Pooling to reduce the resolution of feature maps without much information loss. Compared with existing methods, ADNet further explores the potential of shrunk polygon-based approaches and obtains excellent detection results at an impressive speed. Extensive experiments on several datasets (Total-Text, CTW1500, MSRA-TD500 and ICDAR2015) verify the superiority of our method.
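The aspect ratio sensitivity described in the abstract can be illustrated with a small sketch. It is not ADNet's formulation, which the abstract does not give. It assumes the offset rule commonly used by shrunk polygon-based detectors, d = A(1 - r^2)/L for shrinking and D' = A' * u / L' for dilation, restricts itself to axis-aligned rectangles with mitered corners, and uses a shrink ratio of 0.4 as an illustrative default. The function names are hypothetical. Under those assumptions it computes the dilation factor u that would exactly restore the original rectangle.

```python
def shrink_offset(width, height, shrink_ratio):
    """Inward offset d = A * (1 - r^2) / L for a width x height rectangle.

    Assumed formula (common in shrunk polygon-based detectors), not taken
    from the abstract.
    """
    area = width * height
    perimeter = 2 * (width + height)
    return area * (1 - shrink_ratio ** 2) / perimeter


def ideal_dilation_factor(width, height, shrink_ratio=0.4):
    """Dilation factor u for which D' = A' * u / L' exactly undoes the shrink.

    A' and L' are the area and perimeter of the shrunk rectangle.
    shrink_ratio=0.4 is an illustrative assumption.
    """
    d = shrink_offset(width, height, shrink_ratio)
    shrunk_w, shrunk_h = width - 2 * d, height - 2 * d
    if shrunk_w <= 0 or shrunk_h <= 0:
        raise ValueError("shrunk rectangle collapsed")
    shrunk_area = shrunk_w * shrunk_h
    shrunk_perimeter = 2 * (shrunk_w + shrunk_h)
    # Setting D' equal to d and solving for u gives u = d * L' / A'.
    return d * shrunk_perimeter / shrunk_area


if __name__ == "__main__":
    # Three rectangles with the same area but growing aspect ratio.
    for w, h in [(100, 100), (200, 50), (400, 25)]:
        u = ideal_dilation_factor(w, h)
        print(f"{w}x{h}: ideal dilation factor = {u:.3f}")
```

In this toy setting the factor needed for an exact reconstruction grows with the aspect ratio, so a single fixed value cannot be tight for both square and elongated instances.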
Pages: 6983-6996
Page count: 14