Optical character recognition of handwritten Arabic using hidden Markov models

Cited by: 1
Authors
Aulama, Mohannad M. [1 ]
Natsheh, Asem M. [1 ]
Abandah, Gheith A. [1 ]
Olama, Mohammed M. [2 ]
Affiliations
[1] Univ Jordan, Dept Comp Engn, Amman 11942, Jordan
[2] CSED, Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
Keywords
Character recognition; OCR; Arabic OCR; hidden Markov models (HMMs); Viterbi algorithm;
DOI
10.1117/12.884087
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The problem of optical character recognition (OCR) of handwritten Arabic has not yet been solved satisfactorily. In this paper, an Arabic OCR algorithm is developed based on hidden Markov models (HMMs) combined with the Viterbi algorithm, which results in improved and more robust recognition of characters at the sub-word level. Integrating HMMs represents another step in the overall OCR trends currently being researched in the literature. The proposed approach exploits the structure of characters in the Arabic language, in addition to their extracted features, to achieve improved recognition rates. Useful statistical information about the Arabic language is first extracted and then used to estimate the probabilistic parameters of the HMM. A new custom implementation of the HMM is developed in this study, in which the transition matrix is built from a large collected corpus and the emission matrix is built from the results obtained via the extracted character features. Recognition is performed using the Viterbi algorithm, which selects the most probable sequence of sub-words. The model was implemented to recognize the sub-word unit of Arabic text, so that the recognition rate is no longer tied to the worst recognition rate of any single character but instead draws on the overall structure of the Arabic language. Numerical results show a potentially large improvement in recognition from using the proposed algorithms.
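The pipeline described in the abstract (language statistics for the transition matrix, classifier output for the emission matrix, Viterbi decoding) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes individual characters as hidden states, add-one smoothing on bigram counts, and a toy Latin alphabet standing in for Arabic characters. The function names `estimate_transitions` and `viterbi`, the toy corpus, and the classifier scores are all hypothetical.

```python
import math


def estimate_transitions(corpus, alphabet, smoothing=1.0):
    """Estimate start and transition log-probabilities from bigram counts.

    Assumption: add-`smoothing` counts so that unseen pairs keep a nonzero
    probability. The paper's actual estimation procedure is not given here.
    """
    start_counts = {c: smoothing for c in alphabet}
    trans_counts = {a: {b: smoothing for b in alphabet} for a in alphabet}
    for word in corpus:
        if not word:
            continue
        start_counts[word[0]] += 1
        for a, b in zip(word, word[1:]):
            trans_counts[a][b] += 1
    total = sum(start_counts.values())
    log_start = {c: math.log(n / total) for c, n in start_counts.items()}
    log_trans = {}
    for a, row in trans_counts.items():
        row_total = sum(row.values())
        log_trans[a] = {b: math.log(n / row_total) for b, n in row.items()}
    return log_start, log_trans


def viterbi(log_start, log_trans, log_emit):
    """Return the most probable character sequence.

    log_emit holds one dict per position, mapping each character to
    log P(observed features | character), e.g. from a character classifier.
    """
    states = list(log_start)
    best = {s: log_start[s] + log_emit[0][s] for s in states}
    back = []
    for emit in log_emit[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            best[s] = prev[p] + log_trans[p][s] + emit[s]
            pointers[s] = p
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=best.get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


# Toy data (hypothetical): Latin letters stand in for Arabic characters.
alphabet = "abc"
corpus = ["ab"] * 5 + ["ca"]
log_start, log_trans = estimate_transitions(corpus, alphabet)

# Hypothetical classifier scores for a two-character sub-word.
scores = [
    {"a": 0.90, "b": 0.05, "c": 0.05},
    {"a": 0.10, "b": 0.40, "c": 0.50},
]
log_emit = [{c: math.log(p) for c, p in pos.items()} for pos in scores]

# Per-character decision without language statistics.
greedy = "".join(max(pos, key=pos.get) for pos in scores)
# Joint decision using the corpus-derived transitions.
decoded = viterbi(log_start, log_trans, log_emit)
print(greedy, decoded)
```

In this toy case the classifier alone slightly prefers `c` at the second position, while the corpus statistics make `b` far more likely after `a`, so the joint decoding differs from the per-character decision. This is the sense in which recognition at the sub-word level need not be limited by the weakest single-character result.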
Pages: 12
Related papers
50 in total
  • [21] Duration Models for Arabic Text Recognition using Hidden Markov Models
    Slimane, Fouad
    Ingold, Rolf
    Alimi, Adel M.
    Hennebert, Jean
    2008 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MODELLING CONTROL & AUTOMATION, VOLS 1 AND 2, 2008, : 838 - +
  • [22] Handwritten Nushu Character Recognition Based on Hidden Markov Model
    Wang, Jiangqing
    Zhu, Rongbo
    JOURNAL OF COMPUTERS, 2010, 5 (05) : 663 - 670
  • [23] Hidden Markov Model based Tamil handwritten character recognition
    Dept. of Computer Science and Engineering, National Engineering College, Kovilpatti-628 503, India
    Unknown
    Unknown
    Adv Model Anal B, 2007, 3-4 (1-14):
  • [24] Online handwritten shape recognition using segmental hidden Markov models
    Artieres, Thierry
    Marukatat, Sanparith
    Gallinari, Patrick
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2007, 29 (02) : 205 - 217
  • [25] Handwritten and Typewritten Text Identification and Recognition using Hidden Markov Models
    Cao, Huaigu
    Prasad, Rohit
    Natarajan, Prem
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 744 - 748
  • [26] A Survey on Arabic Optical Character Recognition and an Isolated Handwritten Arabic Character Recognition Algorithm using Encoded Freeman Chain Code
    Althobaiti, Hassan
    Lu, Chao
    2017 51ST ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2017,
  • [27] Early Handwritten Music Recognition with Hidden Markov Models
    Calvo-Zaragoza, Jorge
    Toselli, Alejandro H.
    Vidal, Enrique
    PROCEEDINGS OF 2016 15TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR), 2016, : 319 - 324
  • [28] Off-line unconstrained handwritten numeral character recognition with multiple hidden Markov models
    Namane, A
    Arezki, M
    Guessoum, A
    Soubari, E
    Meyrueis, P
    Bruynooghe, M
    PROCEEDINGS OF THE FOURTH IASTED INTERNATIONAL CONFERENCE ON VISUALIZATION, IMAGING, AND IMAGE PROCESSING, 2004, : 269 - 276
  • [29] Arabic calligraphy, typewritten and handwritten using optical character recognition (OCR) system
    Al-Barhamtoshy, Hassanin M.
    Jambi, Kamal M.
    Ahmed, Hany
    Mohamed, Shaimaa
    Abdo, Sherif M.
    Rashwan, Mohsen A.
    BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2019, 12 (02): : 283 - 296
  • [30] Recognition of writer-independent off-line handwritten Arabic (Indian) numerals using hidden Markov models
    Mahmoud, Sabri
    SIGNAL PROCESSING, 2008, 88 (04) : 844 - 857