A survey of document image word spotting techniques

Cited by: 83
Authors
Giotis, Angelos P. [1, 2]
Sfikas, Giorgos [2]
Gatos, Basilis [2]
Nikou, Christophoros [1]
Affiliations
[1] Univ Ioannina, Dept Comp Sci & Engn, Ioannina, Greece
[2] Natl Ctr Sci Res Demokritos, Computat Intelligence Lab, Inst Informat & Telecommun, GR-15310 Athens, Greece
Keywords
Word spotting; Retrieval; Document indexing; Features; Representation; Relevance feedback; HIDDEN MARKOV-MODELS; HANDWRITTEN DOCUMENTS; TEXT LINE; SEGMENTATION; RETRIEVAL; RECOGNITION; CHARACTER; ONLINE; EXTRACTION; SIMILARITY;
DOI
10.1016/j.patcog.2017.02.023
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Vast collections of documents available in image format need to be indexed for information retrieval purposes. In this framework, word spotting is an alternative solution to optical character recognition (OCR), which is rather inefficient for recognizing text of degraded quality and unknown fonts usually appearing in printed text, or writing style variations in handwritten documents. Over the past decade there has been a growing interest in addressing document indexing using word spotting which is reflected by the continuously increasing number of approaches. However, there exist very few comprehensive studies which analyze the various aspects of a word spotting system. This work aims to review the recent approaches as well as fill the gaps in several topics with respect to the related works. The nature of texts and inherent challenges addressed by word spotting methods are thoroughly examined. After presenting the core steps which compose a word spotting system, we investigate the use of retrieval enhancement techniques based on relevance feedback which improve the retrieved results. Finally, we present the datasets which are widely used for word spotting, we describe the evaluation standards and measures applied for performance assessment and discuss the results achieved by the state of the art. (C) 2017 Elsevier Ltd. All rights reserved.
Pages: 310-332
Page count: 23