Arabic Word Recognition System for Historical Documents using Multiscale Representation Method

Cited by: 0
Authors
Elaiwat, Said [1 ]
Abu-Zanona, Marwan [2 ]
Affiliations
[1] Jouf Univ, Coll Comp & Informat Sci, Dept Comp Sci, Sakakah 72441, Saudi Arabia
[2] Al Imam Mohammad IbnSaud Islamic Univ IMSIU, Coll Sharia & Islamic Studies Al Ahsaa, Dept Comp Sci, Al Ahsaa, Saudi Arabia
Keywords
Word recognition; multiscale convexity concavity analysis; historical documents; dynamic time warping; HANDWRITING RECOGNITION; FEATURES;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
In recent decades, considerable effort has gone into developing automated handwriting recognition systems. Recognition usually involves several complex processes, including image pre-processing, segmentation, feature extraction and matching. The task becomes harder for historical documents, which exhibit skew, document degradation and structural noise. Despite the success achieved for English, the recognition of handwritten Arabic remains a major challenge for many reasons. As a Semitic language, Arabic differs from other languages (e.g., European languages) in several respects, such as its complex structure, implicit characters, concatenation, and writing styles and direction. This work proposes a full recognition system for word recognition from Arabic historical documents. In the proposed system, a novel feature extraction method is presented to define robust features from Arabic words. Prior to feature extraction, each input image is pre-processed and segmented into words. The features of each word/sub-word are then defined based on Multiscale Convexity Concavity (MCC) analysis of the word's contour shape. For feature matching, a circular shift method is proposed to reduce the computational cost, in place of traditional dynamic time warping (DTW), which is computationally expensive. Finally, the proposed algorithm was evaluated on a well-known dataset, namely Ibn Sina, and showed high performance on historical documents at low computational cost.
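The abstract does not spell out how the circular shift matching works, so the following is only a minimal sketch of one plausible reading: two equal-length feature sequences are compared under every cyclic rotation, and the smallest Euclidean distance is kept. The function name `circular_shift_distance`, the use of Euclidean distance, and the toy sequences are all assumptions made for illustration, not the authors' implementation.

```python
from math import sqrt


def circular_shift_distance(a, b):
    """Return (best_distance, best_shift) for two equal-length sequences.

    Hypothetical helper: b is rotated by every possible offset and compared
    with a using Euclidean distance; the smallest distance wins.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    n = len(a)
    if n == 0:
        return 0.0, 0
    best_distance, best_shift = float("inf"), 0
    for shift in range(n):
        # Compare a[i] with the element of b that lands at position i
        # after rotating b left by `shift` positions.
        d = sqrt(sum((a[i] - b[(i + shift) % n]) ** 2 for i in range(n)))
        if d < best_distance:
            best_distance, best_shift = d, shift
    return best_distance, best_shift


# Toy feature sequences (made up): candidate is query rotated by two positions.
query = [0.0, 1.0, 3.0, 2.0]
candidate = [3.0, 2.0, 0.0, 1.0]
distance, shift = circular_shift_distance(query, candidate)
```

Each candidate shift costs a single linear pass with no warping table, which is one way such a scheme could be cheaper than filling a full DTW cost matrix; restricting the range of shifts tried would reduce the cost further.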
Pages: 823-830
Page count: 8
Related papers
50 in total
  • [21] Handwritten Arabic and Roman word recognition using holistic approach
    Samir Malakar
    Samanway Sahoo
    Anuran Chakraborty
    Ram Sarkar
    Mita Nasipuri
    The Visual Computer, 2023, 39 : 2909 - 2932
  • [22] A segmentation-free word spotting method for historical printed documents
    Thomas Konidaris
    Anastasios L. Kesidis
    Basilis Gatos
    Pattern Analysis and Applications, 2016, 19 : 963 - 976
  • [23] Arabic isolated word recognition system using hybrid feature extraction techniques and neural network
    Boussaid, Lotfi
    Hassine, Mohamed
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (01) : 29 - 37
  • [24] An Automatic Method for Enhancing Character Recognition in Degraded Historical Documents
    Pereira e Silva, Gabriel
    Lins, Rafael Dueire
    11TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2011), 2011, : 553 - 557
  • [25] Using Attributes for Word Spotting and Recognition in Polytonic Greek Documents
    Sfikas, Giorgos
    Giotis, Angelos P.
    Louloudis, Georgios
    Gatos, Basilis
    2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 686 - 690
  • [26] A knowledge-based recognition system for historical Mongolian documents
    Su, Xiangdong
    Gao, Guanglai
    Wei, Hongxi
    Bao, Feilong
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2016, 19 (03) : 221 - 235
  • [27] A knowledge-based recognition system for historical Mongolian documents
    Xiangdong Su
    Guanglai Gao
    Hongxi Wei
    Feilong Bao
    International Journal on Document Analysis and Recognition (IJDAR), 2016, 19 : 221 - 235
  • [28] Neuro-Markovian hybrid system for handwritten arabic word recognition
    Narima, Z
    Messaoud, R
    Mouldi, B
    ICECS 2003: PROCEEDINGS OF THE 2003 10TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS, VOLS 1-3, 2003, : 878 - 881
  • [29] Word spotting in historical printed documents using shape and sequence comparisons
    Khurshid, Khurram
    Faure, Claudie
    Vincent, Nicole
    PATTERN RECOGNITION, 2012, 45 (07) : 2598 - 2609
  • [30] Word spotting in historical documents using primitive codebook and dynamic programming
    Roy, Partha Pratim
    Rayar, Frederic
    Ramel, Jean-Yves
    IMAGE AND VISION COMPUTING, 2015, 44 : 15 - 28