Arabic word recognition system for historical documents using multiscale representation method

Cited by: 0
Authors
Elaiwat S. [1]
Abu-Zanona M. [2]
Affiliations
[1] Department of Computer Sciences, College of Computer and Information Sciences, Jouf University, Sakakah
[2] Department of Computer Sciences, College of Shari'a and Islamic Studies in Al Ahsaa, Al Imam Mohammad Ibn Saud Islamic University (IMSIU), Al Ahsaa
Keywords
Dynamic time warping; Historical documents; Multiscale convexity concavity analysis; Word recognition
DOI
10.14569/IJACSA.2020.01104107
Abstract
Over recent decades, considerable effort has gone into developing automated handwriting recognition systems. Recognition usually involves several complex processes, including image pre-processing, segmentation, feature extraction and matching. The task becomes harder for historical documents, which exhibit skew, document degradation and structural noise. Despite the success achieved for English, the recognition of handwritten Arabic remains a major challenge for many reasons. Arabic, as a Semitic language, differs from other languages (e.g., European languages) in several respects, such as complex structure, implicit characters, concatenation, writing styles and writing direction. This work proposes a full recognition system for word recognition from Arabic historical documents. In the proposed system, a novel feature extraction method is presented to define robust features from Arabic words. Prior to feature extraction, each input image is pre-processed and segmented into words. The features of each word/sub-word are then defined based on Multiscale Convexity Concavity (MCC) analysis of the word's contour shape. For feature matching, a circular shift method is proposed to reduce the computational cost, in place of traditional dynamic time warping (DTW), which is computationally expensive. Finally, the proposed algorithm has been evaluated on a well-known dataset, namely Ibn Sina, and showed high performance for historical documents with low computational cost. © 2020 Science and Information Organization.
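The abstract does not give the details of the matching step, so the following is only a minimal sketch of the general idea behind circular-shift matching, set beside a textbook DTW for contrast. It assumes each word contour has already been resampled to a fixed number of points and described at several scales, giving an array of shape (points, scales). The function names, the L1 distance and the array layout are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def circular_shift_distance(a, b):
    """Smallest mean absolute difference between a and any circular shift of b.

    a, b: descriptors of shape (n_points, n_scales); equal shapes are assumed.
    Returns (distance, shift), where shift is the roll applied to b.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("descriptors must have the same shape")
    best_dist, best_shift = np.inf, 0
    # Try every starting point on the closed contour; no warping table is built.
    for s in range(a.shape[0]):
        d = np.abs(a - np.roll(b, s, axis=0)).mean()
        if d < best_dist:
            best_dist, best_shift = d, s
    return best_dist, best_shift


def dtw_distance(a, b):
    """Textbook dynamic time warping with an L1 local cost, for comparison."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.abs(a[i - 1] - b[j - 1]).sum()
            # Extend the cheapest of the three allowed predecessor cells.
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.random((16, 4))          # hypothetical 16-point, 4-scale descriptor
    rotated = np.roll(query, 5, axis=0)  # same contour, different starting point
    print(circular_shift_distance(query, rotated))
    print(dtw_distance(query, rotated))
```

In this sketch the shift search only re-indexes one descriptor and compares it elementwise, whereas DTW fills a full accumulated-cost table for a single fixed starting point. Whether this matches the cost savings reported in the paper depends on details the abstract does not state.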
Pages: 823 - 830
Page count: 7