Natural Language Morphology Integration in Off-Line Arabic Optical Text Recognition

Cited by: 12
Authors
Kanoun, Slim [1]
Alimi, Adel M. [1]
Lecourtier, Yves [2]
Affiliations
[1] Univ Sfax, Natl Sch Engineers, REGIM, Sfax 3038, Tunisia
[2] Univ Rouen, LITIS Lab, F-76800 St Etienne, France
Keywords
Arabic text image; linguistic concepts of Arabic vocabulary; morphological characterization of word; off-line recognition; word categorization; MACHINE RECOGNITION; WORD RECOGNITION; HANDWRITTEN; SYSTEM
DOI
10.1109/TSMCB.2010.2072990
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a new linguistics-based approach, called the affixal approach, for Arabic word and text image recognition. Most existing work in the field integrates knowledge of the Arabic language into the recognition process in one of two ways: either in post-recognition, using a dictionary of words to validate the word hypotheses suggested by the OCR, or in the course of the recognition process (recognition directed by a lexicon), using a statistical model of the language (Hidden Markov Model or N-gram). The proposed approach instead uses the linguistic concepts of the vocabulary to direct and simplify the recognition process. Its principal contribution is the ability to categorize word hypotheses into words that are derived from roots and words that are not, and to characterize each word hypothesis morphologically, in order to prepare the text hypotheses for later analyses (for example, syntactic analysis to filter the sentence hypotheses).
Pages: 579-590
Number of pages: 12