A comparative approach on detecting multi-lingual and multi-oriented text in natural scene images

Cited by: 3
Authors
Yegnaraman, Aparna [1]
Valli, S. [1]
Affiliations
[1] Anna Univ, Coll Engn, Dept Comp Sci & Engn, Chennai 600025, Tamil Nadu, India
Keywords
Scene text detection; PIoU loss; Genetic algorithm; You only look once; Differentiable binarization; Flexible threshold; LOCALIZATION; RECOGNITION;
DOI
10.1007/s10489-020-01972-1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Text helps convey an intended message to users accurately. The primary scope of this work is detecting text in natural scene images for quadrilateral-type and polygon-type datasets. For quadrilateral-type datasets, a regression-based method using a modified You Only Look Once (YOLOv4) network is used. The hyperparameters for training the network are optimized using a Genetic Algorithm, which proves to be a more suitable candidate than traditional methods. The Pixels-IoU (PIoU) loss is introduced to derive an accurate bounding box, and it appears effective under various challenging scenarios with high aspect ratios and complex backgrounds. This approach yielded quick results for quadrilateral-type datasets but did not scale to arbitrarily shaped and curved scene text, so a segmentation-based approach is adopted to improve the results. It introduces a binarization operation into a segmentation network to boost detection accuracy for polygon-type datasets. A new module, DiffBiSeg (Differentiable Binarization in Segmentation network), facilitates post-processing and aids text detection performance by setting the binarization thresholds flexibly within the segmentation network. The efficacy of both approaches is seen in their respective experimental results.
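The flexible thresholding mentioned in the abstract can be illustrated with the commonly used differentiable binarization formulation, in which a hard threshold is replaced by a steep sigmoid of the difference between a probability map and a per-pixel threshold map. The sketch below is an assumption about the general technique, not the DiffBiSeg module itself: the function name, the amplification factor `k = 50`, and the toy maps are all hypothetical and chosen for illustration only.

```python
import math

def differentiable_binarization(prob, thresh, k=50.0):
    """Soft step function applied per pixel: 1 / (1 + exp(-k * (p - t))).

    prob   -- 2D list of text probabilities in [0, 1]
    thresh -- 2D list of per-pixel thresholds in [0, 1] (same shape)
    k      -- amplification factor (assumed value; larger means a sharper step)
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob, thresh)
    ]

# Toy 2x2 probability map and threshold map (made-up values).
prob_map = [[0.9, 0.2], [0.55, 0.45]]
thresh_map = [[0.5, 0.5], [0.5, 0.5]]
approx_binary = differentiable_binarization(prob_map, thresh_map)
```

Because the sigmoid is differentiable, the threshold map can in principle be learned together with the probability map during training, whereas a hard threshold would block gradients.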
Pages: 3696-3717
Number of pages: 22