Document image retrieval through word shape coding

Cited by: 48
Authors
Lu, Shijian [1]
Li, Linlin [2]
Tan, Chew Lim [2]
Affiliations
[1] Agency for Science, Technology and Research, Institute for Infocomm Research, Singapore 119613, Singapore
[2] National University of Singapore, School of Computing, Department of Computer Science, Singapore 117543, Singapore
Keywords
document image retrieval; document image analysis; word shape coding
DOI
10.1109/TPAMI.2008.89
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a document retrieval technique that is capable of searching document images without optical character recognition (OCR). The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content through annotating each word image by a word shape code. In particular, we annotate word images by using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant to various types of document degradation.
Pages: 1913-1918
Number of pages: 6
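
The word shape coding idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper extracts features from word images, whereas this sketch simulates the features on plain text. The character sets ASCENDERS and DESCENDERS, the HOLES table, the code alphabet (A, D, x plus a hole count), and the function names are all assumptions made for illustration. Water reservoirs are omitted because they cannot be simulated meaningfully without image data.

```python
from collections import defaultdict

# Assumed feature tables for lowercase Latin letters (illustrative only;
# the real values depend on the font and are measured on word images).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")
HOLES = {"a": 1, "b": 1, "d": 1, "e": 1, "g": 1, "o": 1, "p": 1, "q": 1}


def char_code(ch):
    """Encode one character as shape class plus hole count."""
    if ch in ASCENDERS:
        shape = "A"
    elif ch in DESCENDERS:
        shape = "D"
    else:
        shape = "x"
    return f"{shape}{HOLES.get(ch, 0)}"


def word_shape_code(word):
    """Concatenate the per-character codes of a word (letters only)."""
    return "".join(char_code(ch) for ch in word.lower() if ch.isalpha())


def build_index(docs):
    """Map each word shape code to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            code = word_shape_code(word)
            if code:
                index[code].add(doc_id)
    return index


def search(index, keyword):
    """Return ids of documents with a word whose code matches the keyword."""
    return sorted(index.get(word_shape_code(keyword), set()))


docs = {"d1": "the dog ran", "d2": "a cat sat"}
index = build_index(docs)
print(search(index, "dog"))
```

In this simplified coding, different words can share one code (for example "run" and "sun"), so a match is only a candidate hit. The richer feature set described in the abstract is what would reduce such collisions in practice.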