Natural Language Morphology Integration in Off-Line Arabic Optical Text Recognition

Cited by: 12
Authors
Kanoun, Slim [1]
Alimi, Adel M. [1]
Lecourtier, Yves [2]
Affiliations
[1] Univ Sfax, Natl Sch Engineers, REGIM, Sfax 3038, Tunisia
[2] Univ Rouen, LITIS Lab, F-76800 St Etienne, France
Keywords
Arabic text image; linguistic concepts of Arabic vocabulary; morphological characterization of word; off-line recognition; word categorization; MACHINE RECOGNITION; WORD RECOGNITION; HANDWRITTEN; SYSTEM
DOI
10.1109/TSMCB.2010.2072990
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a new linguistic-based approach, called the affixal approach, for Arabic word and text image recognition. Most existing works in the field integrate knowledge of the Arabic language into the recognition process in one of two ways: either after recognition, using a dictionary of words to validate the word hypotheses suggested by the OCR, or during the recognition process (recognition directed by a lexicon), using a statistical model of the language (hidden Markov model or N-gram). The proposed approach uses the linguistic concepts of the vocabulary to direct and simplify the recognition process. The principal contribution of the proposed approach is that it categorizes the word hypotheses into words that are either derived or not derived from roots and characterizes each word hypothesis morphologically, in order to prepare the text hypotheses for later analyses (for example, syntactic analysis to filter the sentence hypotheses).
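As a rough illustration of the idea of categorizing word hypotheses as derived or not derived from roots, and of attaching a morphological description to each, here is a minimal sketch. It is not the authors' algorithm: the affix lists, the stem-to-root table, the Latin transliteration, and the names `Analysis`, `analyze`, and `filter_hypotheses` are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon in a Latin transliteration (assumption, not from the paper).
PREFIXES = ("", "Al", "wa")
SUFFIXES = ("", "wn", "At")
STEM_TO_ROOT = {"kAtb": "ktb", "mktwb": "ktb", "drs": "drs"}
NON_DERIVED = {"mn", "fy"}  # toy words not derived from a root (e.g. particles)


@dataclass(frozen=True)
class Analysis:
    word: str
    derived: bool
    prefix: str = ""
    stem: str = ""
    suffix: str = ""
    root: Optional[str] = None


def analyze(word: str) -> Optional[Analysis]:
    """Return a morphological description of a word hypothesis, or None if it is rejected."""
    if word in NON_DERIVED:
        return Analysis(word=word, derived=False)
    # Try every prefix/suffix pair and keep the first split whose stem is known.
    for prefix in PREFIXES:
        for suffix in SUFFIXES:
            if not (word.startswith(prefix) and word.endswith(suffix)):
                continue
            if len(word) <= len(prefix) + len(suffix):
                continue  # nothing would be left for the stem
            stem = word[len(prefix):len(word) - len(suffix)]
            if stem in STEM_TO_ROOT:
                return Analysis(word, True, prefix, stem, suffix, STEM_TO_ROOT[stem])
    return None


def filter_hypotheses(hypotheses: List[str]) -> List[Analysis]:
    """Keep only the hypotheses that receive an analysis."""
    analyses = (analyze(h) for h in hypotheses)
    return [a for a in analyses if a is not None]


if __name__ == "__main__":
    for analysis in filter_hypotheses(["AlkAtbwn", "Alkxtbwn", "mn"]):
        print(analysis)
```

In this sketch, a hypothesis that cannot be split into known affixes and a known stem is discarded, while the surviving hypotheses carry the prefix, stem, suffix, and root that a later stage such as syntactic analysis could use.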
Pages: 579-590
Number of pages: 12