Browsing Heterogeneous Document Collections by a Segmentation-free Word Spotting Method

Cited by: 78
Authors
Rusinol, Marcal [1]
Aldavert, David [1]
Toledo, Ricardo [1]
Llados, Josep [1]
Affiliations
[1] Univ Autonoma Barcelona, Comp Vis Ctr, Dept Ciencies Comput, Bellaterra 08193, Barcelona, Spain
Keywords
Word Spotting; Heterogeneous Document Collections; Dense SIFT Features; Latent Semantic Indexing; RETRIEVAL;
DOI
10.1109/ICDAR.2011.22
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where patches are represented by a bag-of-visual-words model powered by SIFT descriptors. A later refinement of the feature vectors is performed by applying the latent semantic indexing technique. The proposed method performs well on both handwritten and typewritten historical document images. We have also tested our method on documents written in non-Latin scripts.
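The pipeline named in the abstract (patch descriptors, bag-of-visual-words histograms, latent semantic indexing) can be illustrated with a minimal sketch. This is not the authors' implementation: random vectors stand in for SIFT descriptors, the codebook is random rather than learned, and every function name and parameter value below is a hypothetical choice made for illustration only.

```python
import numpy as np


def bovw_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword; return an L2-normalized histogram."""
    # Squared Euclidean distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def fit_lsi(patch_histograms, n_topics):
    """Truncated SVD of the patch-by-word matrix; returns a (n_words, n_topics) projection."""
    _, _, vt = np.linalg.svd(patch_histograms, full_matrices=False)
    return vt[:n_topics].T


def rank_patches(query_hist, patch_histograms, projection):
    """Rank patches by cosine similarity to the query in the reduced space."""
    q = query_hist @ projection
    p = patch_histograms @ projection
    q = q / (np.linalg.norm(q) + 1e-12)
    p = p / (np.linalg.norm(p, axis=1, keepdims=True) + 1e-12)
    scores = p @ q
    return np.argsort(-scores), scores


# Toy data (assumption): 20 patches, 30 descriptors each, 128-D like SIFT.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 128))  # stand-in for a learned codebook
patch_descriptors = [rng.normal(size=(30, 128)) for _ in range(20)]
histograms = np.stack([bovw_histogram(d, codebook) for d in patch_descriptors])

projection = fit_lsi(histograms, n_topics=8)
order, scores = rank_patches(histograms[3], histograms, projection)
```

In this toy setup the query is one of the indexed patches, so it is expected to rank first. In a real setting the descriptors would come from a dense SIFT extractor, and the codebook would typically be learned by clustering.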
Pages: 63-67
Page count: 5