Database for Arabic Printed Text Recognition Research

Cited by: 0
Authors
Jaiem, Faten Kallel [1 ]
Kanoun, Slim [1 ]
Khemakhem, Maher [1 ]
El Abed, Haikal [3 ]
Kardoun, Jihain [2 ]
Affiliations
[1] Univ Sfax, ISIMS, MIRACL Lab, Sfax, Tunisia
[2] Univ Sfax, ENIS, Dept Comp Engn, Sfax, Tunisia
[3] Tech Univ Carolo Wilhelmina Braunschweig, Inst Commun Technol, Braunschweig, Germany
Source
IMAGE ANALYSIS AND PROCESSING (ICIAP 2013), PT 1 | 2013 / Vol. 8156
Keywords
Arabic printed text; APTID / MF database; Open vocabulary; Ground truth; PATTERN-RECOGNITION;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a real database for Arabic printed text recognition, APTID / MF (Arabic Printed Text Image Database / Multi-Font). The database can be used to evaluate systems that recognize Arabic printed text with an open vocabulary. APTID / MF may also be used for research in word segmentation and font identification. APTID / MF was obtained from 387 pages of Arabic printed documents scanned in grayscale at a resolution of 300 dpi. From these documents, 1,845 text blocks were extracted. In addition, a ground truth file is provided for each text block. APTID / MF also includes an Arabic printed character image dataset made up of 27,402 samples. The database is freely available to interested researchers.
Pages: 251-259
Page count: 9
Related papers
50 in total
  • [21] Developing Discrete Density Hidden Markov Models for Arabic Printed Text Recognition
    Awaida, Sameh M.
    Khorsheed, Mohammad S.
    2012 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND CYBERNETICS (CYBERNETICSCOM), 2012, : 35 - 39
  • [22] MACHINE RECOGNITION OF PRINTED ARABIC TEXT UTILIZING NATURAL-LANGUAGE MORPHOLOGY
    AMIN, A
    ALFEDAGHI, S
    INTERNATIONAL JOURNAL OF MAN-MACHINE STUDIES, 1991, 35 (06): : 769 - 788
  • [23] Baseline Isolated Printed Text Image Database for Pashto Script Recognition
    Siddiqu, Arfa
    Basit, Abdul
    Noor, Waheed
    Khan, Muhammad Asfandyar
    Kakar, M. Saeed H.
    Khan, Azam
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 37 (01): : 875 - 885
  • [24] A Novel Minimal Arabic Script for Preparing Databases and Benchmarks for Arabic Text Recognition Research
    Al-Muhtaseb, Husni A.
    Mahmoud, Sabri A.
    Qahwaji, Rami S.
    SIGNAL PROCESSING SYSTEMS, 2009, : 37 - +
  • [25] Recognition of printed arabic text based on global features and decision tree learning techniques
    Amin, A
    PATTERN RECOGNITION, 2000, 33 (08) : 1309 - 1323
  • [26] Recognition of off-line printed Arabic text using Hidden Markov Models
    Al-Muhtaseb, Husni A.
    Mahmoud, Sabri A.
    Qahwaji, Rami S.
    SIGNAL PROCESSING, 2008, 88 (12) : 2902 - 2912
  • [27] Efficient Recognition of Machine Printed Arabic Text Using Partial Segmentation and Hausdorff Distance
    Saabni, Raid
    2014 6TH INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2014, : 284 - 289
  • [28] Printed Arabic Script Recognition: A Survey
    Alghamdi, Mansoor
    Teahan, William
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (09) : 415 - 428
  • [29] Window repositioning for printed Arabic recognition
    Khoury, Ihab
    Gimenez, Adria
    Juan, Alfons
    Andres-Ferrer, Jesus
    PATTERN RECOGNITION LETTERS, 2015, 51 : 86 - 93
  • [30] Printed Arabic document recognition system
    Jin, JM
    Wang, H
    Ding, XQ
    Peng, LR
    DOCUMENT RECOGNITION AND RETRIEVAL XII, 2005, 5676 : 48 - 55