The optical character recognition of Urdu-like cursive scripts

Cited by: 94
Authors
Naz, Saeeda [1]
Hayat, Khizar [1, 4]
Razzak, Muhammad Imran [2]
Anwar, Muhammad Waqas [1]
Madani, Sajjad A. [1]
Khan, Samee U. [3]
Affiliations
[1] COMSATS Inst Informat Technol, Abbottabad, Pakistan
[2] King Saud Abdulaziz Univ Hlth Sci, Riyadh, Saudi Arabia
[3] N Dakota State Univ, Fargo, ND 58108 USA
[4] Univ Nizwa, Birkat Al Mawz, Oman
Keywords
Optical character recognition; Ligature; Character; SPOTTING BASED RETRIEVAL; FUZZY-LOGIC; ONLINE; DATABASE; OFFLINE; FEATURES;
DOI
10.1016/j.patcog.2013.09.037
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We survey the optical character recognition (OCR) literature with reference to the Urdu-like cursive scripts. In particular, the Urdu, Pushto, and Sindhi languages are discussed, with the emphasis on the Nasta'liq and Naskh scripts. Before detailing the OCR works, we outline the peculiarities of the Urdu-like scripts and then present the available text image databases. For the sake of clarity, the various attempts are grouped into three parts, namely: (a) printed, (b) handwritten, and (c) online character recognition. Within each part, the works are analyzed with respect to a typical OCR pipeline, with an emphasis on preprocessing, segmentation, feature extraction, classification, and recognition. © 2013 Elsevier Ltd. All rights reserved.
Pages: 1229-1248
Page count: 20
Related papers
50 in total
  • [21] Japanese Cursive Character Recognition for Efficient Transcription
    Ueki, Kazuya
    Kojima, Tomoka
    ICPRAM: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS, 2020, : 402 - 406
  • [22] IMPROVED STATISTICAL FEATURES FOR CURSIVE CHARACTER RECOGNITION
    Saba, Tanzila
    Rehman, Amjad
    Sulong, Ghazali
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2011, 7 (09): : 5211 - 5224
  • [23] Cursive character recognition by learning vector quantization
    Camastra, F
    Vinciarelli, A
    PATTERN RECOGNITION LETTERS, 2001, 22 (6-7) : 625 - 629
  • [24] Cursive digit and character recognition on CEDAR database
    Singh, S
    Hewitt, M
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS: PATTERN RECOGNITION AND NEURAL NETWORKS, 2000, : 569 - 572
  • [25] Online character recognition of handwritten cursive script
    Muthumani, I.
    Uma Kumari, C.R.
    International Journal of Computer Science Issues, 2012, 9 (3 3-2): : 352 - 354
  • [26] Offline Urdu Nastaleeq Optical Character Recognition Based on Stacked Denoising Autoencoder
    Ahmad, Ibrar
    Wang, Xiaojie
    Li, Ruifan
    Rasheed, Shahid
    CHINA COMMUNICATIONS, 2017, 14 (01) : 146 - 157
  • [27] Offline Urdu Nastaleeq Optical Character Recognition Based on Stacked Denoising Autoencoder
    Ibrar Ahmad
    Xiaojie Wang
    Ruifan Li
    Shahid Rasheed
    中国通信 (China Communications), 2017, 14 (01) : 146 - 157
  • [28] Automated compilation of Urdu poetry handwritten image datasets for optical character recognition
    Ijaz, Irtaza
    Namoun, Abdallah
    Aljohani, Nasser
    Alanazi, Meshari Huwaytim
    Alanazi, Mohammad N.
    Shuja, Junaid
    Humayun, Mohammad Ali
    METHODSX, 2025, 14
  • [29] Character recognition of Arabic and Latin scripts
    Hussain, F
    Cowell, J
    2000 IEEE INTERNATIONAL CONFERENCE ON INFORMATION VISUALISATION, PROCEEDINGS, 2000, : 51 - 56
  • [30] Rule based online Urdu character recognition
    Razzak, Muhammad Imran
    Yusaf, Rubiyah
    Husain, Syed Afaq
    Sher, Muhammad
    ICIC Express Letters, 2010, 4 (02): : 571 - 576