A novel approach for detecting and correcting segmentation and recognition errors in Arabic OCR systems

Cited by: 0
Authors
Mostafa, K [1]
Shaheen, SI
Darwish, AM
Farag, I
Affiliations
[1] Cairo Univ, Fac Comp & Informat, Dept Informat Technol, Giza 12613, Egypt
[2] Cairo Univ, Dept Comp Engn, Giza 12613, Egypt
[3] Cairo Univ, Inst Stat Studies & Res, Giza 12613, Egypt
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we propose a new approach for detecting and correcting segmentation and recognition errors in Arabic OCR systems. The approach is suitable for both typewritten and handwritten script recognition systems. Error detection is based on rules of the Arabic language and a morphological analyzer. This type of analysis has the advantage of limiting the dictionary to a practical size: a complete dictionary of roots (no more than 5641 roots), the morphological rules, and all valid patterns can be kept in a moderately sized file. Recognition channel characteristics are modeled using a set of probabilistic finite state machines. Contextual information is used in the form of transition probabilities between letters of a previously defined vocabulary (finite lexicon) and transition probabilities of garbled text. The developed detection and correction modules have been incorporated as a post-processing phase in an Arabic handwritten cursive script recognition system. Experimental results show a considerable enhancement in performance.
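As a rough illustration of how letter transition probabilities from a finite lexicon can flag garbled words, and how a simple channel model can propose a correction, here is a minimal Python sketch. It is not the authors' implementation: the transliterated toy lexicon, the confusion table and its probability, the add-one smoothing, and the function names `train_transitions`, `avg_log_prob`, and `correct` are all assumptions made for this example. It handles only same-length substitution (recognition) errors and ignores segmentation errors and morphological analysis.

```python
import math
from collections import defaultdict

START, END = "^", "$"  # hypothetical word-boundary markers


def train_transitions(lexicon):
    """Estimate letter-to-letter transition probabilities (add-one smoothing)."""
    counts = defaultdict(lambda: defaultdict(int))
    alphabet = {END}
    for word in lexicon:
        symbols = [START] + list(word) + [END]
        alphabet.update(word)
        for prev, cur in zip(symbols, symbols[1:]):
            counts[prev][cur] += 1
    size = len(alphabet)

    def prob(prev, cur):
        # Use .get so lookups of unseen letters do not modify the counts.
        row = counts.get(prev, {})
        return (row.get(cur, 0) + 1) / (sum(row.values()) + size)

    return prob


def avg_log_prob(word, prob):
    """Average log transition probability; low values suggest garbled text."""
    symbols = [START] + list(word) + [END]
    logs = [math.log(prob(p, c)) for p, c in zip(symbols, symbols[1:])]
    return sum(logs) / len(logs)


def correct(observed, lexicon, confusion, p_correct=0.9):
    """Return the same-length lexicon word with the highest channel likelihood."""
    best, best_score = None, float("-inf")
    for cand in lexicon:
        if len(cand) != len(observed):
            continue  # this sketch models substitutions only
        score = 0.0
        for true_ch, seen_ch in zip(cand, observed):
            if true_ch == seen_ch:
                score += math.log(p_correct)
            else:
                # Unlisted confusions get a small floor probability.
                score += math.log(confusion.get((true_ch, seen_ch), 1e-6))
        if score > best_score:
            best, best_score = cand, score
    return best


# Toy data (assumed): transliterated words and one confusion pair (true, seen).
lexicon = ["ktb", "ktab", "drs", "qlm", "mktb"]
confusion = {("b", "t"): 0.05}
prob = train_transitions(lexicon)

print(avg_log_prob("ktb", prob) > avg_log_prob("kxq", prob))
print(correct("ktt", lexicon, confusion))
```

In a fuller system the per-word score would be compared against a tuned threshold to decide when to invoke correction, and the candidate set would come from morphological analysis rather than a flat word list.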
Pages: 530-539
Page count: 10