Segmentation-Free Keyword Retrieval in Historical Document Images

Cited by: 7

Authors:

Rabaev, Irina [1]

Dinstein, Itshak [2]

El-Sana, Jihad [1]

Kedem, Klara [1]

Affiliations:

[1] Ben Gurion Univ Negev, Dept Comp Sci, IL-84105 Beer Sheva, Israel

[2] Ben Gurion Univ Negev, Dept Elect & Comp Engn, IL-84105 Beer Sheva, Israel

Source:

IMAGE ANALYSIS AND RECOGNITION, ICIAR 2014, PT I | 2014 / Vol. 8814

Keywords:

Historical document processing; Keyword retrieval; Segmentation-free; Bag-of-visual-words; Kernelized locality-sensitive hashing

DOI:

10.1007/978-3-319-11758-4_40

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a segmentation-free method for retrieving keywords from degraded historical documents. The proposed method works directly on the grayscale representation and does not require any pre-processing to enhance the document images. The document images are subdivided into overlapping patches of varying sizes, and each patch is described by a bag-of-visual-words descriptor. The resulting patch descriptors are hashed into several hash tables using a kernelized locality-sensitive hashing scheme for efficient retrieval. In such a scheme, the search for a keyword is reduced to a small fraction of the patches, namely those stored in the matching entries of the hash tables. Because we need to capture handwriting variations and the availability of historical documents is limited, we synthesize a small number of samples from the given query to improve the retrieval results. We have tested our approach on historical document images in Hebrew from the Cairo Genizah collection and obtained impressive results.
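The pipeline described in the abstract (overlapping patches of several sizes, one descriptor per patch, several hash tables, lookup restricted to matching buckets) can be illustrated with a rough Python sketch. This is not the authors' implementation, and it makes simplifying assumptions: the descriptor is a plain gray-level histogram standing in for a bag-of-visual-words descriptor, and the hashing is ordinary random-hyperplane LSH rather than the kernelized variant. All names (`extract_patches`, `describe`, `HashIndex`) and parameter values are hypothetical.

```python
import random
from collections import defaultdict


def extract_patches(height, width, sizes, stride_ratio=0.5):
    """Yield (top, left, h, w) for overlapping patches of several sizes."""
    for h, w in sizes:
        step_y = max(1, int(h * stride_ratio))
        step_x = max(1, int(w * stride_ratio))
        for top in range(0, height - h + 1, step_y):
            for left in range(0, width - w + 1, step_x):
                yield (top, left, h, w)


def describe(image, patch, bins=8):
    """Toy descriptor: normalized gray-level histogram.

    Assumption: this replaces the bag-of-visual-words descriptor,
    which would need local features and a learned codebook.
    """
    top, left, h, w = patch
    hist = [0.0] * bins
    for y in range(top, top + h):
        for x in range(left, left + w):
            hist[min(bins - 1, image[y][x] * bins // 256)] += 1.0
    return [v / (h * w) for v in hist]


class HashIndex:
    """Several hash tables keyed by random-hyperplane sign bits.

    Assumption: plain LSH, not the kernelized scheme named in the abstract.
    """

    def __init__(self, dim, n_tables=4, n_bits=6, seed=0):
        rng = random.Random(seed)
        self.planes = [
            [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]

    def _key(self, table_id, vec):
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0
            for plane in self.planes[table_id]
        )

    def add(self, item, vec):
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].append(item)

    def candidates(self, vec):
        """Return only the items stored in the buckets the query falls into."""
        found = set()
        for t, table in enumerate(self.tables):
            found.update(table.get(self._key(t, vec), []))
        return found
```

In use, every patch of a page would be passed through `describe` and `add` once, and a query image would then be compared only against the patches returned by `candidates` instead of against all patches.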

Pages: 369-378

Page count: 10

