A survey of document image word spotting techniques

Cited by: 83
Authors
Giotis, Angelos P. [1, 2]
Sfikas, Giorgos [2]
Gatos, Basilis [2]
Nikou, Christophoros [1]
Affiliations
[1] Univ Ioannina, Dept Comp Sci & Engn, Ioannina, Greece
[2] Natl Ctr Sci Res Demokritos, Computat Intelligence Lab, Inst Informat & Telecommun, GR-15310 Athens, Greece
Keywords
Word spotting; Retrieval; Document indexing; Features; Representation; Relevance feedback; Hidden Markov models; Handwritten documents; Text line; Segmentation; Recognition; Character; Online; Extraction; Similarity
DOI
10.1016/j.patcog.2017.02.023
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Vast collections of documents available in image format need to be indexed for information retrieval purposes. In this framework, word spotting is an alternative solution to optical character recognition (OCR), which is rather inefficient for recognizing text of degraded quality and unknown fonts usually appearing in printed text, or writing style variations in handwritten documents. Over the past decade there has been a growing interest in addressing document indexing using word spotting, which is reflected by the continuously increasing number of approaches. However, there exist very few comprehensive studies which analyze the various aspects of a word spotting system. This work aims to review the recent approaches as well as fill the gaps in several topics with respect to the related works. The nature of texts and inherent challenges addressed by word spotting methods are thoroughly examined. After presenting the core steps which compose a word spotting system, we investigate the use of retrieval enhancement techniques based on relevance feedback which improve the retrieved results. Finally, we present the datasets which are widely used for word spotting, describe the evaluation standards and measures applied for performance assessment, and discuss the results achieved by the state of the art. © 2017 Elsevier Ltd. All rights reserved.
Pages: 310-332
Number of pages: 23