Urdu-Text Detection and Recognition in Natural Scene Images Using Deep Learning

Cited by: 30
Authors
Arafat, Syed Yasser [1 ,2 ]
Iqbal, Muhammad Javed [1 ]
Affiliations
[1] Univ Engn & Technol UET, Dept Comp Sci, Taxila 47080, Pakistan
[2] Mirpur Univ Sci & Technol MUST, Dept CS & IT, Mirpur 10250, Pakistan
Keywords
Text recognition; Image recognition; Character recognition; Neural networks; Feature extraction; Machine learning; Streaming media; BLSTM; deep neural network; FasterRCNN; image classification; Nastalique; optical character recognition (OCR); regression residual neural network (RRNN); synthetic Urdu text; text detection; two stream deep neural network (TSDNN); LIGATURE RECOGNITION; LOCALIZATION; MODEL;
DOI
10.1109/ACCESS.2020.2994214
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Urdu text is written in a cursive script and belongs to a non-Latin family of scripts, like Arabic, Chinese, and Hindi. Detecting and localizing Urdu text in natural scene images is challenging, and so, consequently, is recognizing individual ligatures in those images. In this paper, a methodology is proposed that covers detection, orientation prediction, and recognition of Urdu ligatures in outdoor images. As a first step, a custom FasterRCNN algorithm is used in conjunction with well-known CNNs (SqueezeNet, GoogLeNet, ResNet18, and ResNet50) for detection and localization in images of size 320 x 240 pixels. For ligature orientation prediction, a custom Regression Residual Neural Network (RRNN) is trained and tested on datasets containing randomly oriented ligatures. Ligatures are recognized using a Two Stream Deep Neural Network (TSDNN). In our experiments, five sets of datasets containing 4.2K and 51K Urdu-text-embedded synthetic images were generated using the CLE annotation text to evaluate the tasks of detection, orientation prediction, and recognition of ligatures. These synthetic images contain 132 and 1600 unique ligatures for the 4.2K and 51K images respectively, with 32 variations of each ligature (4 backgrounds and 8 font-color variations). In addition, 1094 real-world images containing more than 12K Urdu characters were used for TSDNN's evaluation. Finally, all four detectors were evaluated and compared on their ability to detect and localize Urdu text using average precision (AP). The FasterRCNN based on ResNet50 features was the best detector, with an AP of 0.98, while the SqueezeNet, GoogLeNet, and ResNet18 based detectors had testing APs of 0.65, 0.88, and 0.87 respectively. RRNN achieved an accuracy of 79% and 99% for the 4K and 51K images respectively. Similarly, for character classification in ligatures, TSDNN attained a partial sequence recognition rate of 94.90% and 95.20% for the 4K and 51K images respectively, and a partial sequence recognition rate of 76.60% was attained for real-world images.
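The dataset sizes quoted in the abstract are consistent with the stated ligature and variation counts. As a quick check, the sketch below assumes that each unique ligature is rendered once per background and font-color combination; the abstract does not spell out the rendering procedure, so this is an assumption, and all names in the code are illustrative only.

```python
# Sanity check of the synthetic dataset sizes quoted in the abstract.
# Assumption: every unique ligature is rendered once for each
# (background, font color) pair; all names here are illustrative only.

BACKGROUNDS = 4
FONT_COLORS = 8
VARIATIONS_PER_LIGATURE = BACKGROUNDS * FONT_COLORS  # 32, as stated


def synthetic_image_count(unique_ligatures: int) -> int:
    """Return the image count for a given number of unique ligatures."""
    return unique_ligatures * VARIATIONS_PER_LIGATURE


small_set = synthetic_image_count(132)    # 4224, roughly 4.2K
large_set = synthetic_image_count(1600)   # 51200, roughly 51K
print(small_set, large_set)
```

Under this assumption, 132 and 1600 unique ligatures give 4224 and 51200 images, which match the rounded 4.2K and 51K figures.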
Pages: 96787 - 96803
Number of pages: 17
Related papers
50 in total
  • [1] A Database for Urdu Text Detection and Recognition in Natural Scene Images
    Chandio, Asghar Ali
    Leghari, Mehwish
    Memon, Mukhtiar Ahmed
    Leghari, Mehjabeen
    Jalbani, Akhtar Hussain
    [J]. MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2020, 39 (01) : 47 - 54
  • [2] Text detection and script identification in natural scene images using deep learning
    Khalil, Ashwaq
    Jarrah, Moath
    Al-Ayyoub, Mahmoud
    Jararweh, Yaser
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2021, 91
  • [3] Deep learning for detection of text polarity in natural scene images
    Perepu, Pavan Kumar
    [J]. NEUROCOMPUTING, 2021, 431 : 1 - 6
  • [4] Text Detection and Recognition for Natural Scene Images Using Deep Convolutional Neural Networks
    Wu, Xianyu
    Luo, Chao
    Zhang, Qian
    Zhou, Jiliu
    Yang, Hao
    Li, Yulian
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2019, 61 (01) : 289 - 300
  • [5] Urdu text in natural scene images: a new dataset and preliminary text detection
    Ali, Hazrat
    Iqbal, Khalid
    Mujtaba, Ghulam
    Fayyaz, Ahmad
    Bulbul, Mohammad Farhad
    Karam, Fazal Wahab
    Zahir, Ali
    [J]. PeerJ Computer Science, 2021, 7 : 1 - 17
  • [6] Urdu text in natural scene images: a new dataset and preliminary text detection
    Ali, Hazrat
    Iqbal, Khalid
    Mujtaba, Ghulam
    Fayyaz, Ahmad
    Bulbul, Mohammad Farhad
    Karam, Fazal Wahab
    Zahir, Ali
    [J]. PEERJ COMPUTER SCIENCE, 2021, 7
  • [7] Text Detection and Recognition in Natural Scene Images
    Pise, Amruta
    Ruikar, S. D.
    [J]. 2014 INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND SIGNAL PROCESSING (ICCSP), 2014,
  • [8] Text Detection and Recognition in Natural Scene Images
    Huang, Xiaoming
    Shen, Tao
    Wang, Run
    Gao, Chenqiang
    [J]. PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON ESTIMATION, DETECTION AND INFORMATION FUSION ICEDIF 2015, 2015 : 44 - 49
  • [9] Research on the Text Detection and Recognition in Natural Scene Images
    Wei Zi-han
    Du Xiao-ping
    Cao Lei
    [J]. ELEVENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2019), 2020, 11373
  • [10] Scene Text Detection and Recognition: The Deep Learning Era
    Shangbang Long
    Xin He
    Cong Yao
    [J]. International Journal of Computer Vision, 2021, 129 : 161 - 184