Database for Arabic Printed Text Recognition Research

Cited by: 0
Authors
Jaiem, Faten Kallel [1]
Kanoun, Slim [1]
Khemakhem, Maher [1]
El Abed, Haikal [3]
Kardoun, Jihain [2]
Affiliations
[1] Univ Sfax, ISIMS, MIRACL Lab, Sfax, Tunisia
[2] Univ Sfax, ENIS, Dept Comp Engn, Sfax, Tunisia
[3] Tech Univ Carolo Wilhelmina Braunschweig, Inst Commun Technol, Braunschweig, Germany
Keywords
Arabic printed text; APTID / MF database; Open vocabulary; Ground truth; PATTERN-RECOGNITION;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a real database for Arabic printed text recognition, APTID / MF (Arabic Printed Text Image Database / Multi-Font). The database can be used to evaluate systems that recognize Arabic printed text with an open vocabulary. APTID / MF may also be used for research in word segmentation and font identification. It was obtained from 387 pages of Arabic printed documents scanned in grayscale at a resolution of 300 dpi. From these documents, 1,845 text blocks have been extracted. In addition, a ground truth file is provided for each text block. APTID / MF also includes an Arabic printed character image dataset made up of 27,402 samples. The database is freely available to interested researchers.
Pages: 251-259
Number of pages: 9
Related papers
50 in total
  • [31] Segmentation of Arabic Text into Characters for Recognition
    Shaikh, Noor Ahmed
    Shaikh, Zubair Ahmed
    Ali, Ghulam
    WIRELESS NETWORKS, INFORMATION PROCESSING AND SYSTEMS, 2008, 20 : 11 - +
  • [32] Offline arabic text recognition system
    Sarfraz, M
    Nawaz, SN
    Al-Khuraidly, A
    2003 INTERNATIONAL CONFERENCE ON GEOMETRIC MODELING AND GRAPHICS, PROCEEDINGS, 2003, : 30 - 35
  • [33] KHATT: Arabic Offline Handwritten Text Database
    Mahmoud, Sabri A.
    Ahmad, Irfan
    Alshayeb, Mohammad
    Al-Khatib, Wasfi G.
    Parvez, Mohammad Tanvir
    Fink, Gernot A.
    Maergner, Volker
    El Abed, Haikal
    13TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2012), 2012, : 449 - 454
  • [34] A Database for Arabic Handwritten Character Recognition
    AlKhateeb, Jawad H.
    INTERNATIONAL CONFERENCE ON COMMUNICATIONS, MANAGEMENT, AND INFORMATION TECHNOLOGY (ICCMIT'2015), 2015, 65 : 556 - 561
  • [35] Comprehensive synthetic Arabic database for on/off-line script recognition research
    Saabni, Raid M.
    El-Sana, Jihad A.
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2013, 16 (03) : 285 - 294
  • [37] A NEW STRUCTURAL TECHNIQUE FOR RECOGNIZING PRINTED ARABIC TEXT
    ALSADOUN, HB
    AMIN, A
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 1995, 9 (01) : 101 - 126
  • [38] Issues and problems in the recognition of Arabic printed texts
    Obaid, A.M.
    Periodica Polytechnica Electrical Engineering, 1997, 41 (04) : 315 - 334
  • [39] Heuristic approach to the recognition of printed Arabic script
    Obaid, AM
    Dobrowiecki, TP
    INES'97 : 1997 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT ENGINEERING SYSTEMS, PROCEEDINGS, 1997, : 197 - 201
  • [40] Open-vocabulary recognition of machine-printed Arabic text using hidden Markov models
    Ahmad, Irfan
    Mahmoud, Sabri A.
    Fink, Gernot A.
    PATTERN RECOGNITION, 2016, 51 : 97 - 111