Handwritten text recognition and information extraction from ancient manuscripts using deep convolutional and recurrent neural network

Cited by: 0
Authors
El Bahi, Hassan [1]
Affiliations
[1] L2IS, Laboratory of Computer and Systems Engineering, Cadi Ayyad University, B.P. 511, Marrakech, 40000, Morocco
Keywords
Deep neural networks; Long short-term memory; Multilayer neural networks; Palmprint recognition
DOI
10.1007/s00500-024-09930-6
Abstract
Digitizing ancient manuscripts and making them accessible to a broader audience is a crucial step in unlocking the wealth of information they hold. However, automatic recognition of handwritten text and extraction of relevant information, such as named entities, from these manuscripts remain among the most difficult research problems because of factors such as poor manuscript quality, complex backgrounds, ink stains, and cursive handwriting. To meet these challenges, we propose two systems. The first performs handwritten text recognition (HTR) in ancient manuscripts. It starts with a preprocessing operation; a convolutional neural network (CNN) then extracts features from each input image; finally, a recurrent neural network (RNN) with Long Short-Term Memory (LSTM) blocks and a Connectionist Temporal Classification (CTC) layer predicts the text contained in the image. The second system recognizes named entities and the relationships among words directly from images of old manuscripts, bypassing the intermediate text transcription step. Like the first, it starts with a preprocessing step. Data augmentation is then used to enlarge the training dataset, after which a CNN model automatically extracts the most relevant features. Finally, a bidirectional LSTM recognizes the named entities and the relationships between word images. Extensive experiments on the ESPOSALLES dataset demonstrate that the proposed systems achieve state-of-the-art performance, exceeding existing systems. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
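The abstract says the CTC layer predicts the text but does not say how the per-frame outputs are decoded into a string. As an illustration only, the sketch below shows greedy (best-path) CTC decoding, one common choice: take the highest-scoring label in each frame, merge consecutive repeats, then drop blanks. The function name `ctc_greedy_decode`, the blank index 0, and the toy alphabet are assumptions for this example and are not taken from the paper.

```python
from typing import List, Sequence

BLANK = 0  # assumption: label 0 is the CTC blank; labels 1..N map to alphabet[0..N-1]


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    alphabet: str,
    blank: int = BLANK,
) -> str:
    """Greedy (best-path) CTC decoding of per-frame label scores.

    frame_scores[t][k] is the score of label k at time step t, for example
    the softmax output of the recurrent layers for one text-line image.
    """
    # 1. Pick the highest-scoring label in every frame.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    # 2. Merge consecutive repeats, then 3. drop blanks.
    chars: List[str] = []
    previous = blank
    for label in best_path:
        if label != blank and label != previous:
            chars.append(alphabet[label - 1])
        previous = label
    return "".join(chars)


# Toy input: 5 frames over the labels {blank, 'a', 'b'}.
frames = [
    [0.10, 0.80, 0.10],  # 'a'
    [0.20, 0.70, 0.10],  # 'a' again (merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank, which separates the two 'a' characters
    [0.10, 0.70, 0.20],  # 'a'
    [0.10, 0.20, 0.70],  # 'b'
]
decoded = ctc_greedy_decode(frames, alphabet="ab")
```

The blank between the second and third `a` frames is what lets CTC output a doubled character. Without it, the repeated frames would collapse into a single `a`.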
Pages: 12249-12268
Page count: 19