Subword Recognition in Historical Arabic Documents using C-GRUs

Cited by: 1
Authors
Hassen, Hanadi [1]
Al-Madeed, Somaya [1]
Bouridane, Ahmed [2]
Affiliations
[1] Qatar Univ, Coll Engn, Comp Sci & Engn Dept, Doha, Qatar
[2] Northumbria Univ, Dept Comp & Informat Sci, Newcastle Upon Tyne NE1 8ST, Tyne & Wear, England
Keywords
handwriting recognition; Arabic historical documents; CNNs; GRUs; classification; HANDWRITTEN; DEEP;
DOI
10.18421/TEM104-19
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Recent years have seen a growing effort to digitize historical manuscripts, which not only ensures the preservation of these collections but also gives researchers and end-users direct access to the images. Recognition of Arabic handwriting is challenging due to the highly cursive nature of the script and to problems specific to historical documents, such as degradation. This paper presents an end-to-end system for recognizing Arabic handwritten subwords in historical documents. More specifically, we introduce a hybrid CNN-GRU model in which a shallow convolutional network learns robust feature representations while the GRU layers carry out the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two datasets, IBN SINA and VML-HD, achieving recognition rates of 96.10% and 98.60%, respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of the proposed model in characterizing Arabic subwords.
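The division of labour described in the abstract (a convolutional front end producing a feature sequence, followed by GRU layers that model that sequence) can be illustrated with a toy sketch. This is not the authors' architecture: the single convolutional layer, the sizes, the random untrained weights, the standard GRU gate equations, and the softmax over a hypothetical set of subword classes are all assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def conv_features(image, kernels):
    """Slide each (height x k) kernel across the image width, apply ReLU.

    Returns one feature vector per horizontal position, i.e. a sequence
    that a recurrent layer can consume.
    """
    height, width = len(image), len(image[0])
    k = len(kernels[0][0])
    sequence = []
    for start in range(width - k + 1):
        feats = []
        for kernel in kernels:
            s = sum(kernel[r][c] * image[r][start + c]
                    for r in range(height) for c in range(k))
            feats.append(max(0.0, s))
        sequence.append(feats)
    return sequence

def gru_step(x, h, p):
    """One step of a standard GRU cell (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    gated = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b)
         for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], gated))]
    # Blend the previous state and the candidate state with the update gate.
    return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(image, kernels, p, out_w):
    """Run conv features through the GRU, then score hypothetical classes."""
    h = [0.0] * len(p["Uz"])
    for x in conv_features(image, kernels):
        h = gru_step(x, h, p)
    return softmax(matvec(out_w, h))

# Hypothetical sizes; none of these values come from the paper.
rng = random.Random(0)
HEIGHT, WIDTH, K, N_KERNELS, HIDDEN, N_CLASSES = 8, 20, 3, 4, 6, 5
kernels = [rand_matrix(HEIGHT, K, rng) for _ in range(N_KERNELS)]
params = {name: rand_matrix(HIDDEN, N_KERNELS if name[0] == "W" else HIDDEN, rng)
          for name in ("Wz", "Uz", "Wr", "Ur", "Wn", "Un")}
out_w = rand_matrix(N_CLASSES, HIDDEN, rng)
image = [[rng.random() for _ in range(WIDTH)] for _ in range(HEIGHT)]
probs = classify(image, kernels, params, out_w)
```

With untrained weights the resulting probabilities are meaningless; the sketch only shows how a subword image becomes a left-to-right feature sequence that a GRU then summarizes. A real system would learn all weights from labelled data and would typically use a deep learning framework rather than hand-written loops.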
Pages: 1630-1637
Number of pages: 8