A novel approach for detecting and correcting segmentation and recognition errors in Arabic OCR systems

Authors
Mostafa, K [1 ]
Shaheen, SI
Darwish, AM
Farag, I
Affiliations
[1] Cairo Univ, Fac Comp & Informat, Dept Informat Technol, Giza 12613, Egypt
[2] Cairo Univ, Dept Comp Engn, Giza 12613, Egypt
[3] Cairo Univ, Inst Stat Studies & Res, Giza 12613, Egypt
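Keywords
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract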
In this paper, we propose a new approach for detecting and correcting segmentation and recognition errors in Arabic OCR systems. The approach is suitable for both typewritten and handwritten script recognition systems. Error detection is based on rules of the Arabic language and a morphological analyzer. This type of analysis has the advantage of limiting the dictionary to a practical size: a complete root dictionary (no more than 5641 roots), together with the morphological rules and all valid patterns, can be kept in a file of moderate size. Recognition channel characteristics are modeled using a set of probabilistic finite state machines. Contextual information is utilized in the form of transitional probabilities between letters of a previously defined vocabulary (finite lexicon) and transitional probabilities of garbled text. The developed detection and correction modules have been incorporated as a post-processing phase in an Arabic handwritten cursive script recognition system. Experimental results show a considerable enhancement in performance.
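The idea of using letter-to-letter transitional probabilities from a finite lexicon can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified stand-in that assumes a plain letter-bigram model with add-alpha smoothing, and it leaves out the morphological analyzer and the probabilistic finite state machines of the recognition channel. The names `train_bigrams`, `log_prob`, `best_candidate`, the toy lexicon (Latin transliterations used only for readability), and the alphabet size are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical boundary markers so that word-initial and word-final
# letters also contribute transitions.
START, END = "^", "$"


def train_bigrams(lexicon):
    """Count letter-to-letter transitions over a finite lexicon."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in lexicon:
        symbols = [START, *word, END]
        for prev, cur in zip(symbols, symbols[1:]):
            counts[prev][cur] += 1
    return counts


def log_prob(word, counts, alphabet_size, alpha=1.0):
    """Add-alpha smoothed log-probability of a word under the bigram counts."""
    symbols = [START, *word, END]
    total = 0.0
    for prev, cur in zip(symbols, symbols[1:]):
        row = counts.get(prev, {})  # .get avoids creating new rows
        numerator = row.get(cur, 0) + alpha
        denominator = sum(row.values()) + alpha * alphabet_size
        total += math.log(numerator / denominator)
    return total


def best_candidate(candidates, counts, alphabet_size):
    """Pick the candidate spelling with the highest log-probability."""
    return max(candidates, key=lambda w: log_prob(w, counts, alphabet_size))


# Toy lexicon (assumed data, transliterated): words sharing the root k-t-b.
LEXICON = ["ktb", "katb", "mktb"]
COUNTS = train_bigrams(LEXICON)
ALPHABET_SIZE = 30  # assumed size of the symbol inventory

# A recognizer output "ktx" versus the lexicon-supported "ktb".
choice = best_candidate(["ktx", "ktb"], COUNTS, ALPHABET_SIZE)
```

In this sketch a word containing a transition never seen in the lexicon receives a lower score, which is one simple way a post-processing phase could flag a likely recognition error and rank replacement candidates.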
Pages: 530-539
Page count: 10