ADOCRNet: A Deep Learning OCR for Arabic Documents Recognition

Cited by: 3

Authors:

Mosbah, Lamia ^{[1]}

Moalla, Ikram ^{[1,2]}

Hamdani, Tarek M. ^{[1,3]}

Neji, Bilel ^{[4]}

Beyrouthy, Taha ^{[4]}

Alimi, Adel M. ^{[1,5]}

Affiliations:

[1] Univ Sfax, Natl Engn Sch Sfax ENIS, ReGIM Lab, REs Grp Intelligent Machines, Sfax 3038, Tunisia

[2] Al Baha Univ, Coll Comp Sci & Informat Technol, Al Bahah 65511, Saudi Arabia

[3] Univ Monastir, Higher Inst Comp Sci Mahdia ISIMa, Monastir 5000, Tunisia

[4] Amer Univ Middle East, Coll Engn & Technol, Egaila 54200, Kuwait

[5] Univ Johannesburg, Fac Engn & Built Environm, Dept Elect & Elect Engn Sci, Johannesburg 3038, South Africa

Source:

IEEE ACCESS | 2024 / Vol. 12

Keywords:

Arabic; document recognition; CNNs; CTC; deep learning; BLSTM; OCR; NEURAL-NETWORKS; CHARACTER-RECOGNITION;

DOI:

10.1109/ACCESS.2024.3379530

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In recent years, optical character recognition (OCR) has experienced a resurgence of interest, especially for contemporary Arabic data. OCR development for printed and handwritten Arabic script remains a challenging task because of the specific characteristics of the script. In this work, we attempt to address these challenges by creating a deep learning OCR for Arabic document recognition called ADOCRNet. It is a novel deep learning framework whose architecture is built of layers of Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory (BLSTM), trained using the Connectionist Temporal Classification (CTC) algorithm. To assess the performance of our OCR, the proposed system is evaluated on two printed text datasets, P-KHATT (text line images) and APTI (word images). It is also evaluated on a handwritten Arabic text dataset, IFN/ENIT (word images). According to the practical tests, the proposed model achieves strong recognition rates on the three datasets. ADOCRNet reaches a Character Error Rate (CER) of 0.01% on the P-KHATT dataset and 0.03% on the APTI dataset, and a Word Error Rate (WER) of 1.09% on the IFN/ENIT dataset, which significantly outperforms the results of current systems.
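
The abstract reports results as CER and WER without defining them in this record. As a rough illustration only, the sketch below assumes the common edit-distance definition (edit operations divided by reference length, expressed as a percentage). The function names `edit_distance`, `cer`, and `wer` are hypothetical and are not taken from the paper, whose exact evaluation protocol may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate in percent (assumed definition, see lead-in)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate in percent, splitting on whitespace (an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under this definition, one wrong character in a four-character reference gives a CER of 25%, so a reported CER of 0.01% would correspond to roughly one character error per 10,000 reference characters.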

Pages: 55620-55631

Number of pages: 12
