Recognition-free Retrieval of Old Arabic Document Images

Authors
Sari, Toufik [1, 2]
Kefali, Abderrahmane [1]
Affiliations
[1] Univ Badji Mokhtar, Lab Gest Elect Documents LabGED, Annaba, Algeria
[2] Badji Mokhtar Annaba Univ, Dept Comp Sci & Informat Engn, Annaba, Algeria
Source
COMPUTACION Y SISTEMAS | 2011 / Vol. 15 / No. 02
Keywords
Document retrieval; Arabic handwriting recognition; approximate string matching; document analysis;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Searching old document images is a relevant issue today. In this paper, we tackle the problem of retrieving old Arabic document images, which form a good part of our heritage and possess inestimable scientific and cultural richness. We propose an approach for indexing and searching degraded document images without recognizing the textual patterns, in order to avoid the high cost and effort of optical character recognition (OCR). Our basic idea is to cast the problem of document image retrieval from the field of document analysis into the field of information retrieval. Thus, we can combine symbolic notation and semic representation and exploit techniques from the two fields, in particular suffix trees and approximate string matching. Each document of the collection is assigned an ASCII file of word codes. Words are represented by their topological features, namely ascenders, descenders, etc. So, instead of searching in the image, we look for word codes in the corresponding code file. Tests performed on two types of documents, Arabic historical documents and Algerian postal envelopes, have shown good performance of the proposed approach.
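The word-code lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the code alphabet (A for ascender, D for descender, L for loop, P for dot) and the function names `edit_distance` and `search_codes` are assumptions made for illustration, and a plain linear scan with Levenshtein distance stands in for the suffix-tree index mentioned in the abstract.

```python
# Hypothetical code alphabet (an assumption, not taken from the paper):
#   A = ascender, D = descender, L = loop, P = dot


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two word codes (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def search_codes(query_code: str, word_codes: list[str], max_errors: int = 1):
    """Return (position, code) pairs within max_errors edits of the query.

    word_codes plays the role of one document's file of word codes.
    """
    return [(pos, code) for pos, code in enumerate(word_codes)
            if edit_distance(query_code, code) <= max_errors]


if __name__ == "__main__":
    document_codes = ["ADL", "ALP", "DDA", "ADLP"]
    print(search_codes("ADL", document_codes, max_errors=1))
```

Allowing a small number of edits is one way to tolerate feature-extraction errors on degraded images; the threshold `max_errors` is likewise an illustrative parameter, not a value reported by the authors.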
Pages: 195 - 208
Number of pages: 14
Related papers
50 in total
  • [1] Document Image Retrieval Based on Texture Features: A Recognition-Free Approach
    Alaei, Fahimeh
    Alaei, Alireza
    Pal, Umapada
    Blumenstein, Michael
    [J]. 2016 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA), 2016, : 456 - 462
  • [2] Recognition-Free Question Answering on Handwritten Document Collections
    Tueselmann, Oliver
    Mueller, Friedrich
    Wolf, Fabian
    Fink, Gernot A.
    [J]. FRONTIERS IN HANDWRITING RECOGNITION, ICFHR 2022, 2022, 13639 : 259 - 273
  • [3] Towards a segmentation and recognition-free approach for content-based document image retrieval of handwritten queries
    Chatbri, Houssem
    Kameyama, Keisuke
    Kwan, Paul
    [J]. Proceedings 3rd IAPR Asian Conference on Pattern Recognition ACPR 2015, 2015, : 146 - 150
  • [4] Do recognition-free recall discrepancies detect retrieval deficits? Response
    Wilde, MC
    Boake, C
    Sherer, M
    [J]. JOURNAL OF CLINICAL AND EXPERIMENTAL NEUROPSYCHOLOGY, 1997, 19 (01) : 153 - 155
  • [5] Individual Recognition-Free Target Enclosure Model
    Kubo, Masao
    Yoshimura, Tatsurou
    Yamaguchi, Akihiro
    Sato, Hiroshi
    [J]. PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 17TH '12), 2012, : 608 - 613
  • [6] A structural description of binary document images: Application for Arabic character recognition
    Zidouri, A
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGING SCIENCE, SYSTEMS AND TECHNOLOGY, VOLS I AND II, 2001, : 458 - 464
  • [7] Individual recognition-free target enclosure model
    Kubo M.
    Yoshimura T.
    Yamaguchi A.
    Sato H.
    [J]. Artificial Life and Robotics, 2012, 17 (1) : 11 - 16
  • [8] A recognition-free mechanism for reliable rejection of brood parasites
    Anderson, Michael G.
    Hauber, Mark E.
    [J]. TRENDS IN ECOLOGY & EVOLUTION, 2007, 22 (06) : 283 - 286
  • [9] Segmentation-Free Keyword Retrieval in Historical Document Images
    Rabaev, Irina
    Dinstein, Itshak
    El-Sana, Jihad
    Kedem, Klara
    [J]. IMAGE ANALYSIS AND RECOGNITION, ICIAR 2014, PT I, 2014, 8814 : 369 - 378
  • [10] Arabic Document Indexing for Improved Text Retrieval
    Al-Lahham, Yaser A. M.
    [J]. 2019 2ND INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS), 2019, : 226 - 230