Arabic Word Recognition System for Historical Documents using Multiscale Representation Method

Authors
Elaiwat, Said [1]
Abu-Zanona, Marwan [2]
Affiliations
[1] Jouf Univ, Coll Comp & Informat Sci, Dept Comp Sci, Sakakah 72441, Saudi Arabia
[2] Al Imam Mohammad IbnSaud Islamic Univ IMSIU, Coll Sharia & Islamic Studies Al Ahsaa, Dept Comp Sci, Al Ahsaa, Saudi Arabia
Keywords
Word recognition; multiscale convexity concavity analysis; historical documents; dynamic time warping; handwriting recognition; features
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In recent decades, considerable effort has gone into developing automated handwriting recognition systems. Recognition usually involves several complex processes, including image pre-processing, segmentation, feature extraction and matching. The task becomes harder for historical documents, which exhibit skew, document degradation and structural noise. Despite the success achieved for English, the recognition of handwritten Arabic remains a major challenge for many reasons. Arabic, as a Semitic language, differs from other languages (e.g., European languages) in several respects, such as complex structure, implicit characters, concatenation, and writing styles and direction. This work proposes a full recognition system for word recognition from Arabic historical documents. In the proposed system, a novel feature extraction method is presented to define robust features from Arabic words. Prior to feature extraction, each input image is pre-processed and segmented into words. The features of each word/sub-word are then defined based on Multiscale Convexity Concavity (MCC) analysis of the word's contour shape. For feature matching, a circular shift method is proposed to reduce the computational cost, instead of traditional dynamic time warping (DTW), which exhibits high computational cost. Finally, the proposed algorithm was evaluated on a well-known dataset, namely Ibn Sina, and showed high performance on historical documents at low computational cost.
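The abstract contrasts DTW with circular-shift matching but gives no details of either, so the following is only a minimal sketch of the two general ideas on one-dimensional feature sequences. The function names, the absolute-difference and Euclidean costs, and the equal-length requirement are all assumptions made for illustration; this is not the authors' implementation, and it does not compute MCC features.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping between two 1-D sequences, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def circular_shift_distance(a, b):
    """Smallest Euclidean distance between a and any circular shift of b.

    Assumes both sequences have the same length (hypothetical simplification).
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    n = len(a)
    best = math.inf
    for shift in range(n):
        # Compare a[i] with b rotated by `shift` positions.
        d = math.sqrt(sum((a[i] - b[(i + shift) % n]) ** 2 for i in range(n)))
        best = min(best, d)
    return best


if __name__ == "__main__":
    contour = [0, 1, 3, 1, 0, 0]
    rotated = [0, 0, 0, 1, 3, 1]  # same values, different starting point
    print(circular_shift_distance(contour, rotated))
    print(dtw_distance(contour, rotated))
```

A closed contour has no canonical starting point, which is one plausible reason a shift-based comparison suits contour features. This naive shift search is itself quadratic in the sequence length, so any cost advantage over DTW would depend on details the abstract does not state.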
Pages: 823-830
Number of pages: 8