Nearest neighbor retrieval using distance-based hashing

Cited by: 45
Authors
Athitsos, Vassilis [1]
Potamias, Michalis [2]
Papapetrou, Panagiotis [2]
Kollios, George [2]
Affiliations
[1] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[2] Boston Univ, Dept Comp Sci, Boston, MA USA
DOI
10.1109/ICDE.2008.4497441
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A method is proposed for indexing spaces with arbitrary distance measures, so as to achieve efficient approximate nearest neighbor retrieval. Hashing methods, such as Locality Sensitive Hashing (LSH), have been successfully applied for similarity indexing in vector spaces and string spaces under the Hamming distance. The key novelty of the hashing technique proposed here is that it can be applied to spaces with arbitrary distance measures, including non-metric distance measures. First, we describe a domain-independent method for constructing a family of binary hash functions. Then, we use these functions to construct multiple multibit hash tables. We show that the LSH formalism is not applicable for analyzing the behavior of these tables as index structures. We present a novel formulation that uses statistical observations from sample data to analyze retrieval accuracy and efficiency for the proposed indexing method. Experiments on several real-world data sets demonstrate that our method produces good trade-offs between accuracy and efficiency, and significantly outperforms VP-trees, which are a well-known method for distance-based indexing.
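The two steps named in the abstract (binary hash functions built from a distance measure, then several multibit hash tables) can be illustrated with a minimal sketch. The abstract does not spell out how the binary functions are constructed, so the pivot-based projection and median threshold used here are assumptions made for illustration. All names (`make_binary_hash`, `build_tables`, `query`) are hypothetical.

```python
import random
from collections import defaultdict

def make_binary_hash(dist, sample, rng):
    """One binary hash function built from distances only (assumed construction)."""
    x1, x2 = rng.sample(sample, 2)          # two pivot objects
    d12 = dist(x1, x2)

    def project(x):
        # Pseudo-projection of x onto the "line" through the pivots,
        # computed purely from pairwise distances (assumption, not from the abstract).
        if d12 == 0:
            return 0.0
        return (dist(x, x1) ** 2 + d12 ** 2 - dist(x, x2) ** 2) / (2 * d12)

    values = sorted(project(s) for s in sample)
    threshold = values[len(values) // 2]    # median split for roughly balanced bits
    return lambda x: 1 if project(x) >= threshold else 0

def build_tables(data, dist, num_tables, bits_per_key, seed=0):
    """Build several hash tables, each keyed by a tuple of bits_per_key bits."""
    rng = random.Random(seed)
    tables = []
    for _ in range(num_tables):
        funcs = [make_binary_hash(dist, data, rng) for _ in range(bits_per_key)]
        buckets = defaultdict(list)
        for idx, obj in enumerate(data):
            buckets[tuple(f(obj) for f in funcs)].append(idx)
        tables.append((funcs, buckets))
    return tables

def query(q, data, dist, tables):
    """Approximate nearest neighbor: compare q only against colliding objects."""
    candidates = set()
    for funcs, buckets in tables:
        candidates.update(buckets.get(tuple(f(q) for f in funcs), []))
    if not candidates:
        return None                          # no collision in any table
    return min(candidates, key=lambda i: dist(q, data[i]))

# Toy usage; any black-box distance function could be substituted here.
data = [3, 8, 15, 16, 23, 42, 57, 91]
dist = lambda a, b: abs(a - b)
tables = build_tables(data, dist, num_tables=4, bits_per_key=2)
print(query(23, data, dist, tables))  # a stored object collides with itself: 4
```

The distance function is only ever called as a black box, which is what lets this style of index apply to non-metric measures. Raising `bits_per_key` shrinks the buckets and speeds up lookups at the cost of missed neighbors, while raising `num_tables` recovers accuracy at the cost of more candidates.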
Pages: 327 / +
Page count: 3
Related papers
50 in total
  • [21] A DOCUMENT-RETRIEVAL SYSTEM BASED ON NEAREST NEIGHBOR SEARCHING
    LUCARELLA, D
    JOURNAL OF INFORMATION SCIENCE, 1988, 14 (01) : 25 - 33
  • [22] Image retrieval based on weighted nearest neighbor tag prediction
    Yao, Qi
    Jiang, Dayang
    Ding, Xiancheng
    JOURNAL OF INTELLIGENT SYSTEMS, 2022, 31 (01) : 589 - 600
  • [23] A model for case retrieval based on ANN and nearest neighbor algorithm
    Zhang, Zhi-Ying
    Wang, Jian-Wei
    Wei, Xiao-Peng
    Yu, Wen-Jing
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 142 - 147
  • [24] ON TESTS OF SPATIAL RANDOMNESS USING MEAN NEAREST NEIGHBOR DISTANCE
    SINCLAIR, DF
    ECOLOGY, 1985, 66 (03) : 1084 - 1085
  • [25] Semantic Neighbor Graph Hashing for Multimodal Retrieval
    Jin, Lu
    Li, Kai
    Hu, Hao
    Qi, Guo-Jun
    Tang, Jinhui
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (03) : 1405 - 1417
  • [26] A Fast Speaker Identification Method Using Nearest Neighbor Distance
    Zeinali, Hossein
    Sameti, Hossein
    Babaali, Bagher
    PROCEEDINGS OF 2012 IEEE 11TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) VOLS 1-3, 2012, : 2159 - 2162
  • [27] An efficient nearest neighbor classifier using an adaptive distance measure
    Dehzangi, Omid
    Zolghadri, Mansoor J.
    Taheri, Shahram
    Dehzangi, Abdollah
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS, PROCEEDINGS, 2007, 4673 : 970 - 978
  • [28] Estimation of Renyi Entropy of Order α Based on the Nearest Neighbor Distance
    Kim, Young-Sik
    2014 INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY AND ITS APPLICATIONS (ISITA), 2014, : 125 - 129
  • [29] HISTOGRAM-BASED FRUIT RIPENESS IDENTIFICATION USING NEAREST-NEIGHBOR DISTANCE
    Mohamad, Fatma Susilawati
    Manaf, Azizah Abdul
    Chuprat, Suriayati
    THIRD INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND TECHNOLOGY (ICCET 2011), 2011, : 483 - +
  • [30] Imbalance Data Classification Using Local Mahalanobis Distance Learning Based on Nearest Neighbor
    Siddappa N.G.
    Kampalappa T.
    SN Computer Science, 2020, 1 (2)