Scene text detection via extremal region based double threshold convolutional network classification

Cited by: 9
Authors
Zhu, Wei [1]
Lou, Jing [1]
Chen, Longtao [1]
Xia, Qingyuan [1]
Ren, Mingwu [1]
Affiliation
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Jiangsu, Peoples R China
Source
PLOS ONE | 2017 / Vol. 12 / No. 08
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
READING TEXT; IMAGES; LOCALIZATION; RECOGNITION; REPRESENTATION; FACE;
DOI
10.1371/journal.pone.0182227
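Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract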
In this paper, we present a robust approach to text detection in natural images, based on a region proposal mechanism. We propose a powerful low-level detector, saliency-enhanced MSER, which extends the widely used MSER (maximally stable extremal regions) by incorporating saliency detection methods and thereby ensures a high recall rate. Given a natural image, the saliency-enhanced MSER algorithm extracts character candidates from three channels of a perception-based illumination-invariant color space. A discriminative convolutional neural network (CNN), jointly trained with multi-level information (pixel-level and character-level), serves as the character candidate classifier. Instead of conventional one-step classification, each image patch is classified as strong text, weak text, or non-text by double-threshold filtering of the confidence scores produced by the CNN. To further prune non-text regions, we develop a recursive neighborhood search algorithm that tracks credible text in the weak text set. Finally, characters are grouped into text lines using heuristic features such as spatial location, size, color, and stroke width. We compare our approach with several state-of-the-art methods, and experiments show that it achieves competitive performance on the public ICDAR 2011 and ICDAR 2013 datasets.
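The double-threshold step and the neighborhood search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives neither threshold values nor a neighbor criterion, so the values of `t_low` and `t_high`, the `Candidate` fields, and the geometric test in `is_neighbor` are all hypothetical. The search is also written iteratively with a queue rather than recursively. The idea shown is only this: strong candidates are kept outright, and a weak candidate is kept if it can be reached from a strong one through a chain of neighbors.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A character candidate (hypothetical fields, for illustration only)."""
    x: float      # box centre, horizontal
    y: float      # box centre, vertical
    h: float      # box height, must be > 0
    score: float  # CNN text confidence in [0, 1]


def classify(score, t_low=0.3, t_high=0.8):
    """Double-threshold labelling; the threshold values are assumptions."""
    if score >= t_high:
        return "strong"
    if score >= t_low:
        return "weak"
    return "non-text"


def is_neighbor(a, b, max_gap=2.0, max_height_ratio=2.0):
    """Assumed neighbor test: close on the same line and of similar height."""
    scale = max(a.h, b.h)
    close = abs(a.x - b.x) <= max_gap * scale and abs(a.y - b.y) <= 0.5 * scale
    similar = max(a.h, b.h) / min(a.h, b.h) <= max_height_ratio
    return close and similar


def track_text(cands, t_low=0.3, t_high=0.8):
    """Return sorted indices of strong candidates plus weak ones linked to them."""
    labels = [classify(c.score, t_low, t_high) for c in cands]
    kept = {i for i, lab in enumerate(labels) if lab == "strong"}
    queue = deque(sorted(kept))
    while queue:
        i = queue.popleft()
        for j, lab in enumerate(labels):
            # Promote a weak candidate once it touches an already kept one.
            if lab == "weak" and j not in kept and is_neighbor(cands[i], cands[j]):
                kept.add(j)
                queue.append(j)
    return sorted(kept)


candidates = [
    Candidate(x=0, y=0, h=10, score=0.9),    # strong
    Candidate(x=15, y=0, h=10, score=0.5),   # weak, next to the strong one
    Candidate(x=30, y=0, h=10, score=0.4),   # weak, reachable only via the chain
    Candidate(x=200, y=0, h=10, score=0.5),  # weak but isolated
    Candidate(x=10, y=0, h=10, score=0.1),   # non-text
]
result = track_text(candidates)
```

With these made-up inputs, the isolated weak candidate and the low-scoring candidate are discarded, while the two weak candidates chained to the strong one survive. A single threshold could not separate these cases by score alone.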
Pages: 17