Natural Language Morphology Integration in Off-Line Arabic Optical Text Recognition

Cited by: 12
Authors
Kanoun, Slim [1]
Alimi, Adel M. [1]
Lecourtier, Yves [2]
Affiliations
[1] Univ Sfax, Natl Sch Engineers, REGIM, Sfax 3038, Tunisia
[2] Univ Rouen, LITIS Lab, F-76800 St Etienne, France
Keywords
Arabic text image; linguistic concepts of Arabic vocabulary; morphological characterization of word; off-line recognition; word categorization; MACHINE RECOGNITION; WORD RECOGNITION; HANDWRITTEN; SYSTEM;
DOI
10.1109/TSMCB.2010.2072990
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose a new linguistics-based approach, called the affixal approach, for Arabic word and text image recognition. Most existing work in the field integrates knowledge of the Arabic language into the recognition process in one of two ways: either after recognition, using a dictionary of words to validate the word hypotheses suggested by the OCR, or in the course of recognition (recognition directed by a lexicon), using a statistical model of the language (hidden Markov model or N-gram). The proposed approach instead uses the linguistic concepts of the vocabulary to direct and simplify the recognition process. Its principal contribution is the ability to categorize the word hypotheses into words that are either derived or not derived from roots, and to characterize each word hypothesis morphologically in order to prepare the text hypotheses for later analyses (for example, syntactic analysis to filter the sentence hypotheses).
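The categorization step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the transliterated affix lists, the tiny stem lexicon, and the names `Analysis`, `analyze`, and `filter_hypotheses` are all hypothetical and chosen only to show the idea of splitting a word hypothesis into prefix, stem, and suffix, labelling it as root-derived or not, and rejecting hypotheses that admit no analysis.

```python
from dataclasses import dataclass
from typing import List, Optional

# Toy, transliterated data (assumptions for illustration only).
PREFIXES = ("wal", "al", "wa", "")   # tried longest first; "" means no prefix
SUFFIXES = ("an", "")                # "" means no suffix

# Stems derived from a root, mapped to (root, pattern).
DERIVED_STEMS = {
    "kitab": ("ktb", "fi'al"),
    "maktab": ("ktb", "maf'al"),
}
# Stems treated as not derived from any root (e.g. a loanword).
NON_DERIVED_STEMS = {"tilifizyun"}


@dataclass
class Analysis:
    word: str
    prefix: str
    stem: str
    suffix: str
    root: Optional[str]      # None when the word is not derived from a root
    pattern: Optional[str]

    @property
    def derived(self) -> bool:
        return self.root is not None


def analyze(word: str) -> Optional[Analysis]:
    """Return the first prefix/stem/suffix split found in the toy lexicon."""
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        for suffix in SUFFIXES:
            if not word.endswith(suffix):
                continue
            stem = word[len(prefix):len(word) - len(suffix)]
            if stem in DERIVED_STEMS:
                root, pattern = DERIVED_STEMS[stem]
                return Analysis(word, prefix, stem, suffix, root, pattern)
            if stem in NON_DERIVED_STEMS:
                return Analysis(word, prefix, stem, suffix, None, None)
    return None  # no valid morphological analysis


def filter_hypotheses(hypotheses: List[str]) -> List[Analysis]:
    """Keep only the word hypotheses that admit an analysis."""
    analyses = (analyze(h) for h in hypotheses)
    return [a for a in analyses if a is not None]


if __name__ == "__main__":
    for a in filter_hypotheses(["walkitab", "almaktob", "tilifizyun"]):
        print(a.word, "derived" if a.derived else "not derived", a.root)
```

In this sketch, a misrecognized hypothesis such as `almaktob` is discarded because no split of it matches the lexicon, while each surviving hypothesis carries a morphological description that a later stage could use.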
Pages: 579-590
Page count: 12