Subword Recognition in Historical Arabic Documents using C-GRUs

Cited by: 1
Authors
Hassen, Hanadi [1 ]
Al-Madeed, Somaya [1 ]
Bouridane, Ahmed [2 ]
Affiliations
[1] Qatar Univ, Coll Engn, Comp Sci & Engn Dept, Doha, Qatar
[2] Northumbria Univ, Dept Comp & Informat Sci, Newcastle Upon Tyne NE1 8ST, Tyne & Wear, England
Keywords
handwriting recognition; Arabic historical documents; CNNs; GRUs; classification; HANDWRITTEN; DEEP
DOI
10.18421/TEM104-19
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent years have witnessed a growing tendency to digitize historical manuscripts, which not only ensures the preservation of these collections but also gives researchers and end users direct access to the images. Recognition of Arabic handwriting is challenging because of the highly cursive nature of the script and the additional problems associated with historical documents, such as degradation. This paper presents an end-to-end system to recognize Arabic handwritten subwords in historical documents. More specifically, we introduce a hybrid CNN-GRU model in which a shallow convolutional network learns robust feature representations while the GRU layers carry out the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two datasets, IBN SINA and VML-HD, achieving recognition rates of 96.10% and 98.60%, respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of the proposed model in characterizing Arabic subwords.
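The data flow described in the abstract (convolutional features read along the image width, then fed step by step into a GRU) can be illustrated with a minimal sketch. This is not the authors' architecture: the layer sizes, the single full-height convolution, the random untrained weights, and every function name (`conv_features`, `gru_step`, `run_cgru`) are assumptions made for illustration only. Bias terms are omitted for brevity, and the GRU update uses one common convention, h_t = (1 - z) * h_{t-1} + z * candidate; some libraries swap the roles of z and 1 - z.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def conv_features(image, kernels):
    # Slide each full-height kernel along the image width and apply ReLU.
    # Returns one feature vector per horizontal position.
    height, width = len(image), len(image[0])
    k_w = len(kernels[0][0])
    seq = []
    for x0 in range(width - k_w + 1):
        feats = []
        for kern in kernels:
            s = sum(kern[y][dx] * image[y][x0 + dx]
                    for y in range(height) for dx in range(k_w))
            feats.append(max(0.0, s))
        seq.append(feats)
    return seq


def gru_step(x, h, p):
    # One GRU update (biases omitted): update gate z, reset gate r,
    # candidate state, then a convex mix of old state and candidate.
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_cgru(image, n_kernels=3, k_w=2, hidden=4, seed=0):
    # Hypothetical sizes; weights are random, so outputs carry no meaning
    # beyond showing the shapes that flow through the pipeline.
    rng = random.Random(seed)
    kernels = [rand_matrix(len(image), k_w, rng) for _ in range(n_kernels)]
    p = {name: rand_matrix(hidden, n_kernels if name[0] == "W" else hidden, rng)
         for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
    seq = conv_features(image, kernels)
    h = [0.0] * hidden
    states = []
    for x in seq:
        h = gru_step(x, h, p)
        states.append(h)
    return seq, states


# Toy 3x5 "image": 4 horizontal positions, each mapped to a 4-unit hidden state.
image = [[0, 1, 1, 0, 0], [1, 1, 0, 0, 1], [0, 0, 1, 1, 0]]
seq, states = run_cgru(image)
print(len(seq), len(states[-1]))  # prints: 4 4
```

In a trained recognizer, each hidden state would additionally pass through an output layer that scores the possible labels; that stage is left out here because the abstract does not specify how the transcription is decoded.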
Pages: 1630-1637
Number of pages: 8