Arabic word recognition system for historical documents using multiscale representation method

Authors
Elaiwat S. [1]
Abu-Zanona M. [2]
Affiliations
[1] Department of Computer Sciences, College of Computer and Information Sciences, Jouf University, Sakakah
[2] Department of Computer Sciences, College of Shari'a and Islamic Studies in Al Ahsaa, Al Imam Mohammad Ibn Saud Islamic University (IMSIU), Al Ahsaa
Keywords
Dynamic time warping; Historical documents; Multiscale convexity concavity analysis; Word recognition
DOI
10.14569/IJACSA.2020.01104107
Abstract
In recent decades, considerable effort has gone into developing automated handwriting recognition systems. Recognition usually involves several complex processes, including image pre-processing, segmentation, feature extraction and matching. The task becomes harder for historical documents, which exhibit skew, document degradation and structural noise. Despite the success achieved for English, the recognition of handwritten Arabic still constitutes a major challenge for many reasons. Arabic, as a Semitic language, differs from other languages (e.g., European languages) in several aspects, such as complex structure, implicit characters, concatenation, and writing styles and direction. This work proposes a full recognition system for the task of word recognition from Arabic historical documents. In the proposed system, a novel feature extraction method is presented to define robust features from Arabic words. Prior to feature extraction, each input image is pre-processed and segmented, resulting in segmented words. After that, the features of each word/sub-word are defined based on Multiscale Convexity Concavity (MCC) analysis of the word's contour shape. For feature matching, a circular shift method is proposed to reduce the computational cost, replacing traditional dynamic time warping (DTW), which is computationally expensive. Finally, the proposed algorithm has been evaluated on a well-known dataset, namely Ibn Sina, and showed high performance for historical documents with low computational cost. © 2020 Science and Information Organization.
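The abstract contrasts circular shift matching with DTW but gives no details of either step. The following is only a minimal sketch of the general idea, assuming each word is described by a one-dimensional sequence of contour values and that the two sequences being compared have already been resampled to the same length. The function names `circular_shift_distance` and `dtw_distance` are hypothetical and are not taken from the paper; the actual method may differ.

```python
import numpy as np


def circular_shift_distance(a, b):
    """Smallest Euclidean distance between a and any cyclic rotation of b.

    Assumption: a and b are 1-D descriptors of equal length.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape or a.ndim != 1:
        raise ValueError("descriptors must be 1-D and of equal length")
    # Try every rotation of b and keep the best match.
    return min(float(np.linalg.norm(a - np.roll(b, s))) for s in range(len(b)))


def dtw_distance(a, b):
    """Textbook dynamic time warping with absolute-difference local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # table[i, j] holds the cheapest alignment cost of a[:i] with b[:j].
    table = np.full((n + 1, m + 1), np.inf)
    table[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            table[i, j] = cost + min(
                table[i - 1, j], table[i, j - 1], table[i - 1, j - 1]
            )
    return float(table[n, m])


if __name__ == "__main__":
    query = [0.0, 1.0, 2.0, 3.0]
    rotated = [2.0, 3.0, 0.0, 1.0]  # same values, different start point
    print(circular_shift_distance(query, rotated))
    print(dtw_distance(query, rotated))
```

In this sketch, circular shifting compares whole sequences under a rigid rotation, so it tolerates a different starting point on a closed contour but not local stretching, whereas DTW fills a full alignment table to absorb local stretching.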
Pages: 823 - 830
Number of pages: 7
Related papers
  • [1] Arabic Word Recognition System for Historical Documents using Multiscale Representation Method
    Elaiwat, Said
    Abu-Zanona, Marwan
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (04) : 823 - 830
  • [2] Subword Recognition in Historical Arabic Documents using C-GRUs
    Hassen, Hanadi
    Al-Madeed, Somaya
    Bouridane, Ahmed
    TEM JOURNAL-TECHNOLOGY EDUCATION MANAGEMENT INFORMATICS, 2021, 10 (04): : 1630 - 1637
  • [3] Holistic word recognition for handwritten historical documents
    Lavrenko, V
    Rath, TM
    Manmatha, R
    FIRST INTERNATIONAL WORKSHOP ON DOCUMENT IMAGE ANALYSIS FOR LIBRARIES, PROCEEDINGS, 2004, : 278 - 287
  • [4] ICFHR2014 Competition on Word Recognition from Historical Documents ANcestry Word REcognition from Segmented Historical Documents (ANWRESH)
    Reese, Jackson
    Murdock, Michael
    Reid, Shawn
    Hamilton, Blaine
    2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2014, : 803 - 808
  • [5] Word Stretching for Effective Segmentation and Classification of Historical Arabic Handwritten Documents
    Al Aghbari, Zaher
    Brook, Salama
    RCIS 2009: PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON RESEARCH CHALLENGES IN INFORMATION SCIENCE, 2009, : 217 - 224
  • [6] Wavelet network for recognition system of Arabic word
    Ejbali, Ridha
    Zaied, Mourad
    Ben Amar, Chokri
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2010, 13 (03) : 163 - 174
  • [7] Evaluation of Feature-Embedding Methods for Word Spotting in Historical Arabic Documents
    Fathallah, Abir
    Ibn Khedher, Mohamed
    El-Yacoubi, Mounim A.
    Ben Amara, Najoua Essoukri
    PROCEEDINGS OF THE 2020 17TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD 2020), 2020, : 34 - 39
  • [8] VML-HD: The Historical Arabic Documents Dataset for Recognition Systems
    Kassis, Majeed
    Abdalhaleem, Alaa
    Droby, Ahmad
    Alaasam, Reem
    El-Sana, Jihad
    2017 1ST INTERNATIONAL WORKSHOP ON ARABIC SCRIPT ANALYSIS AND RECOGNITION (ASAR), 2017, : 11 - 14
  • [9] Document Recognition and Translation System for Unconstrained Arabic Documents
    Cao, Huaigu
    Chen, Jinying
    Devlin, Jacob
    Prasad, Rohit
    Natarajan, Prem
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 318 - 321
  • [10] Word spotting for Handwritten Arabic documents using Harris detector
    Elfakiri, Youssef
    Chenouni, Driss
    Khaissidi, Ghizlane
    El Yacoubi, Mounim
    Mrabti, Mostafa
    2016 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY FOR ORGANIZATIONS DEVELOPMENT (IT4OD), 2016,