Subword Recognition in Historical Arabic Documents using C-GRUs

Authors
Hassen, Hanadi [1]
Al-Madeed, Somaya [1]
Bouridane, Ahmed [2]
Affiliations
[1] Qatar Univ, Coll Engn, Comp Sci & Engn Dept, Doha, Qatar
[2] Northumbria Univ, Dept Comp & Informat Sci, Newcastle Upon Tyne NE1 8ST, Tyne & Wear, England
Keywords
handwriting recognition; Arabic historical documents; CNNs; GRUs; classification; HANDWRITTEN; DEEP
DOI
10.18421/TEM104-19
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent years have seen a growing effort to digitize historical manuscripts, which both preserves these collections and gives researchers and end users direct access to the images. Recognizing Arabic handwriting is challenging because the script is highly cursive and because historical documents bring further difficulties such as degradation. This paper presents an end-to-end system for recognizing handwritten Arabic subwords in historical documents. Specifically, we introduce a hybrid CNN-GRU model in which a shallow convolutional network learns robust feature representations, while the GRU layers perform the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two datasets, IBN SINA and VML-HD, achieving recognition rates of 96.10% and 98.60%, respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of the proposed model in characterizing Arabic subwords.
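The recurrent half of such a hybrid model can be illustrated with a minimal sketch of a single GRU layer, written with the Python standard library only. This is not the authors' implementation: the abstract does not give layer sizes, weights, or the output stage, so the dimensions, the random parameters, and the helper names (`gru_step`, `init_params`, `run_gru`) are all assumptions made for illustration. The input vectors stand in for the per-column features that a convolutional front end would produce, and the update uses one common GRU convention, h_t = (1 - z) * h_{t-1} + z * h_candidate.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gate(W, U, b, x, h, act):
    # act(W x + U h + b), applied element-wise.
    return [act(a + c + d) for a, c, d in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x, h, p):
    """One GRU update for input vector x and previous hidden state h."""
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h, sigmoid)        # update gate
    r = gate(p["Wr"], p["Ur"], p["br"], x, h, sigmoid)        # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]                    # reset-scaled state
    cand = gate(p["Wh"], p["Uh"], p["bh"], x, rh, math.tanh)  # candidate state
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def init_params(n_in, n_hid, rng, scale=0.5):
    """Random weights and zero biases; hypothetical values, not trained ones."""
    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]

    p = {}
    for g in "zrh":
        p["W" + g] = mat(n_hid, n_in)
        p["U" + g] = mat(n_hid, n_hid)
        p["b" + g] = [0.0] * n_hid
    return p


def run_gru(seq, p, n_hid):
    """Run the GRU over a sequence and return the hidden state at every step."""
    h = [0.0] * n_hid
    states = []
    for x in seq:
        h = gru_step(x, h, p)
        states.append(h)
    return states


# Assumed toy sizes: 6 feature columns of dimension 4, hidden size 3.
rng = random.Random(0)
N_IN, N_HID, STEPS = 4, 3, 6
params = init_params(N_IN, N_HID, rng)
features = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(STEPS)]
states = run_gru(features, params, N_HID)
print(len(states), len(states[-1]))
```

In a full recognizer, each hidden state would feed a classification layer over the character or subword labels. That stage is omitted here because the abstract does not describe it.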
Pages: 1630-1637
Page count: 8