The optical character recognition of Urdu-like cursive scripts

Cited by: 94
Authors
Naz, Saeeda [1]
Hayat, Khizar [1,4]
Razzak, Muhammad Imran [2]
Anwar, Muhammad Waqas [1]
Madani, Sajjad A. [1]
Khan, Samee U. [3]
Affiliations
[1] COMSATS Inst Informat Technol, Abbottabad, Pakistan
[2] King Saud Abdulaziz Univ Hlth Sci, Riyadh, Saudi Arabia
[3] N Dakota State Univ, Fargo, ND 58108 USA
[4] Univ Nizwa, Birkat Al Mawz, Oman
Keywords
Optical character recognition; Ligature; Character; SPOTTING BASED RETRIEVAL; FUZZY-LOGIC; ONLINE; DATABASE; OFFLINE; FEATURES;
DOI
10.1016/j.patcog.2013.09.037
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We survey the optical character recognition (OCR) literature with reference to the Urdu-like cursive scripts. In particular, the Urdu, Pushto, and Sindhi languages are discussed, with the emphasis being on the Nasta'liq and Naskh scripts. Before detailing the OCR works, the peculiarities of the Urdu-like scripts are outlined, followed by a presentation of the available text image databases. For the sake of clarity, the various attempts are grouped into three parts, namely: (a) printed, (b) handwritten, and (c) online character recognition. Within each part, the works are analyzed with respect to a typical OCR pipeline, with an emphasis on preprocessing, segmentation, feature extraction, classification, and recognition. © 2013 Elsevier Ltd. All rights reserved.
Pages: 1229-1248
Number of pages: 20
Related papers
50 in total
  • [41] Off-Line Cursive Handwritten Tamil Character Recognition
    Kannan, R. Jagadeesh
    Prabhakar, R.
    Suresh, R. M.
    SECTECH: 2008 INTERNATIONAL CONFERENCE ON SECURITY TECHNOLOGY, PROCEEDINGS, 2008, : 159 - +
  • [42] Experimental analysis of the modified direction feature for cursive character recognition
    Liu, XY
    Blumenstein, M
    NINTH INTERNATIONAL WORKSHOP ON FRONTIERS IN HANDWRITING RECOGNITION, PROCEEDINGS, 2004, : 353 - 358
  • [43] Review on OCR for Handwritten Indian Scripts Character Recognition
    Kumar, Munish
    Jindal, M. K.
    Sharma, R. K.
    ADVANCES IN DIGITAL IMAGE PROCESSING AND INFORMATION TECHNOLOGY, 2011, 205 : 268 - +
  • [44] Handwritten character recognition of popular south Indian scripts
    Pal, Umapada
    Sharma, Nabin
    Wakabayashi, Tetsushi
    Kimura, Fumitaka
    ARABIC AND CHINESE HANDWRITING RECOGNITION, 2008, 4768 : 251 - +
  • [45] Stroke-Based Data Augmentation for Enhancing Optical Character Recognition of Ancient Handwritten Scripts
    Ayyoob, M. P.
    Ilyas, P. Muhamed
    IEEE ACCESS, 2024, 12 : 186794 - 186802
  • [46] A Self Organizing Map Based Urdu Nasakh Character Recognition
    Hussain, Syed Afaq
    Zaman, Safdar
    Ayub, Muhammad
    ICET: 2009 INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES, PROCEEDINGS, 2009, : 267 - +
  • [47] Adapting multilingual vision language transformers for low-resource Urdu optical character recognition (OCR)
    Cheema M.D.A.
    Shaiq M.D.
    Mirza F.
    Kamal A.
    Naeem M.A.
    PeerJ Computer Science, 2024, 10 : 1 - 24
  • [48] Adapting multilingual vision language transformers for low-resource Urdu optical character recognition (OCR)
    Cheema, Musa Dildar Ahmed
    Shaiq, Mohammad Daniyal
    Mirza, Farhaan
    Kamal, Ali
    Naeem, M. Asif
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [49] Meta-feature based few-shot Siamese learning for Urdu optical character recognition
    Naseer, Asma
    Zafar, Kashif
    COMPUTATIONAL INTELLIGENCE, 2022, 38 (05) : 1707 - 1727
  • [50] Combining Offline and Online Preprocessing for Online Urdu Character Recognition
    Razzak, Muhammad Imran
    Hussain, Syed Afaq
    Sher, Muhammad
    Khan, Zeeshan Shafi
    IMECS 2009: INTERNATIONAL MULTI-CONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS, VOLS I AND II, 2009, : 912 - +