Nearest neighbor retrieval using distance-based hashing

Cited by: 45
Authors
Athitsos, Vassilis [1]
Potamias, Michalis [2]
Papapetrou, Panagiotis [2]
Kollios, George [2]
Affiliations
[1] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[2] Boston Univ, Dept Comp Sci, Boston, MA USA
DOI
10.1109/ICDE.2008.4497441
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A method is proposed for indexing spaces with arbitrary distance measures, so as to achieve efficient approximate nearest neighbor retrieval. Hashing methods, such as Locality Sensitive Hashing (LSH), have been successfully applied for similarity indexing in vector spaces and string spaces under the Hamming distance. The key novelty of the hashing technique proposed here is that it can be applied to spaces with arbitrary distance measures, including non-metric distance measures. First, we describe a domain-independent method for constructing a family of binary hash functions. Then, we use these functions to construct multiple multibit hash tables. We show that the LSH formalism is not applicable for analyzing the behavior of these tables as index structures. We present a novel formulation that uses statistical observations from sample data to analyze retrieval accuracy and efficiency for the proposed indexing method. Experiments on several real-world data sets demonstrate that our method produces good trade-offs between accuracy and efficiency, and significantly outperforms VP-trees, which are a well-known method for distance-based indexing.
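The abstract does not spell out how the binary hash functions are constructed. As a rough illustration only, the sketch below assumes that each function projects an object onto a pair of pivot objects using nothing but distances, then thresholds that projection so that roughly half of a sample maps to 1. Several such bits are concatenated into a multibit key, and several tables are built. The names `make_binary_hash`, `build_tables`, `query`, the middle-half interval rule, and the toy non-metric distance `sq_diff` are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def make_binary_hash(pivot_a, pivot_b, sample, dist):
    """Build one binary hash function from two pivot objects (assumed scheme)."""
    d_ab = dist(pivot_a, pivot_b)
    if d_ab == 0:
        raise ValueError("pivots must be at nonzero distance")

    def project(x):
        # Pseudo-projection onto the "line" through the pivots.
        # Only distance values are used, so no vector coordinates are needed.
        return (dist(x, pivot_a) ** 2 + d_ab ** 2 - dist(x, pivot_b) ** 2) / (2 * d_ab)

    values = sorted(project(s) for s in sample)
    # Assumption: accept the middle half of the sample projections,
    # so that the two bit values are roughly balanced.
    lo = values[len(values) // 4]
    hi = values[(3 * len(values)) // 4]
    return lambda x: 1 if lo <= project(x) <= hi else 0


def build_tables(database, dist, num_tables, bits, rng):
    """Build several hash tables, each keyed by a tuple of `bits` binary hashes."""
    tables = []
    for _ in range(num_tables):
        funcs = []
        while len(funcs) < bits:
            a, b = rng.sample(database, 2)
            if dist(a, b) > 0:  # skip degenerate pivot pairs
                funcs.append(make_binary_hash(a, b, database, dist))
        buckets = defaultdict(list)
        for idx, obj in enumerate(database):
            key = tuple(f(obj) for f in funcs)
            buckets[key].append(idx)
        tables.append((funcs, buckets))
    return tables


def query(q, database, tables, dist):
    """Return the index of the closest candidate that collides with q, or None."""
    candidates = set()
    for funcs, buckets in tables:
        key = tuple(f(q) for f in funcs)
        candidates.update(buckets.get(key, []))
    if not candidates:
        return None
    # Exact distances are computed only for the colliding candidates.
    return min(candidates, key=lambda i: dist(q, database[i]))


def sq_diff(x, y):
    # Toy non-metric distance: squared difference violates the triangle inequality.
    return (x - y) ** 2


rng = random.Random(0)
data = [rng.uniform(0, 100) for _ in range(200)]
tables = build_tables(data, sq_diff, num_tables=5, bits=4, rng=rng)
best = query(data[17], data, tables, sq_diff)
```

In this sketch, more bits per table shrink the buckets (fewer distance computations, more missed neighbors), while more tables raise the chance that a true neighbor collides with the query at least once. This is the accuracy versus efficiency trade-off the abstract refers to, shown here without any of the paper's statistical analysis.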
Pages: 327 / +
Page count: 3
Related papers
50 in total
  • [31] Improvement of PCA-Based Approximate Nearest Neighbor Search Using Distance Statistics
    Ogita, Toshiro
    Ichihashi, Hidetomo
    Notsu, Akira
    Honda, Katsuhiro
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2014, 18 (04) : 658 - 664
  • [32] A Semantic Distance Based Nearest Neighbor Method for Image Annotation
    Wu, Wei
    Gao, Guanglai
    Nie, Jianyun
    JOURNAL OF COMPUTERS, 2014, 9 (10) : 2274 - 2280
  • [33] A Correlation-Based Distance Function for Nearest Neighbor Classification
    Rodriguez, Yanet
    De Baets, Bernard
    Garcia, Maria M.
    Morell, Carlos
    Grau, Ricardo
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS AND APPLICATIONS, PROCEEDINGS, 2008, 5197 : 284 - +
  • [34] Heterogeneous Information Network Hashing for Fast Nearest Neighbor Search
    Peng, Zhen
    Luo, Minnan
    Li, Jundong
    Chen, Chen
    Zheng, Qinghua
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2019), PT I, 2019, 11446 : 571 - 586
  • [35] A HASHING-ORIENTED NEAREST-NEIGHBOR SEARCHING SCHEME
    CHANG, CC
    WU, TC
    PATTERN RECOGNITION LETTERS, 1993, 14 (08) : 625 - 630
  • [36] Principal Component Hashing: An Accelerated Approximate Nearest Neighbor Search
    Matsushita, Yusuke
    Wada, Toshikazu
    ADVANCES IN IMAGE AND VIDEO TECHNOLOGY, PROCEEDINGS, 2009, 5414 : 374 - 385
  • [37] Manifold-ranking based retrieval using k-regular nearest neighbor graph
    Wang, Bin
    Pan, Feng
    Hu, Kai-Mo
    Paul, Jean-Claude
    PATTERN RECOGNITION, 2012, 45 (04) : 1569 - 1577
  • [38] Distance-based nearest neighbour forecasting with application to exchange rate predictability
    Kyriazi, Foteini
    Thomakos, Dimitrios D.
    IMA JOURNAL OF MANAGEMENT MATHEMATICS, 2020, 31 (04) : 469 - 490
  • [39] Adaptive bit allocation hashing for approximate nearest neighbor search
    Guo, Qin-Zhen
    Zeng, Zhi
    Zhang, Shuwu
    NEUROCOMPUTING, 2015, 151 : 719 - 728
  • [40] Shared Nearest Neighbor Clustering in a Locality Sensitive Hashing Framework
    Kanj, Sawsan
    Bruls, Thomas
    Gazut, Stephane
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2018, 25 (02) : 236 - 250