Nearest neighbor retrieval using distance-based hashing

Cited by: 45
Authors
Athitsos, Vassilis [1]
Potamias, Michalis [2]
Papapetrou, Panagiotis [2]
Kollios, George [2]
Affiliations
[1] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[2] Boston Univ, Dept Comp Sci, Boston, MA USA
DOI
10.1109/ICDE.2008.4497441
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A method is proposed for indexing spaces with arbitrary distance measures, so as to achieve efficient approximate nearest neighbor retrieval. Hashing methods, such as Locality Sensitive Hashing (LSH), have been successfully applied for similarity indexing in vector spaces and string spaces under the Hamming distance. The key novelty of the hashing technique proposed here is that it can be applied to spaces with arbitrary distance measures, including non-metric distance measures. First, we describe a domain-independent method for constructing a family of binary hash functions. Then, we use these functions to construct multiple multibit hash tables. We show that the LSH formalism is not applicable for analyzing the behavior of these tables as index structures. We present a novel formulation that uses statistical observations from sample data to analyze retrieval accuracy and efficiency for the proposed indexing method. Experiments on several real-world data sets demonstrate that our method produces good trade-offs between accuracy and efficiency, and significantly outperforms VP-trees, which are a well-known method for distance-based indexing.
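The abstract names two steps (binary hash functions built from distances alone, then multiple multibit hash tables) without giving the construction. The Python sketch below is one way to illustrate the idea, under assumptions that are not stated in this record: each binary function projects an object onto the "line" through two randomly chosen pivot objects using only the distance measure, and outputs 1 when the projection falls inside an interval covering roughly the middle half of a sample. The names `line_projection`, `make_binary_hash`, `build_tables`, and `query`, and the parameters `k` (bits per table) and `l` (number of tables), are hypothetical and chosen for illustration.

```python
import random


def line_projection(dist, x, p1, p2):
    """Project x onto the pseudo-line through pivots p1 and p2,
    using only pairwise distances (no coordinates are required)."""
    d12 = dist(p1, p2)
    return (dist(x, p1) ** 2 + d12 ** 2 - dist(x, p2) ** 2) / (2 * d12)


def make_binary_hash(dist, sample, rng):
    """Build one binary hash function from two random pivots.
    Assumption: the interval [t1, t2] spans the middle half of the
    sample projections, so the two output bits are roughly balanced."""
    while True:
        p1, p2 = rng.sample(sample, 2)
        if dist(p1, p2) > 0:  # avoid division by zero in the projection
            break
    proj = sorted(line_projection(dist, s, p1, p2) for s in sample)
    n = len(proj)
    t1, t2 = proj[n // 4], proj[(3 * n) // 4]

    def h(x):
        return 1 if t1 <= line_projection(dist, x, p1, p2) <= t2 else 0

    return h


def build_tables(dist, database, sample, k, l, seed=0):
    """Build l hash tables; each key concatenates k binary functions."""
    rng = random.Random(seed)
    tables = []
    for _ in range(l):
        funcs = [make_binary_hash(dist, sample, rng) for _ in range(k)]
        buckets = {}
        for idx, obj in enumerate(database):
            key = tuple(f(obj) for f in funcs)
            buckets.setdefault(key, []).append(idx)
        tables.append((funcs, buckets))
    return tables


def query(dist, database, tables, q):
    """Return the index of the closest candidate that shares a bucket
    with q in at least one table, or None if there is no candidate."""
    candidates = set()
    for funcs, buckets in tables:
        key = tuple(f(q) for f in funcs)
        candidates.update(buckets.get(key, ()))
    if not candidates:
        return None
    return min(candidates, key=lambda i: dist(q, database[i]))
```

Because the code touches objects only through `dist`, any distance measure can be plugged in, including a non-metric one such as the squared difference `lambda a, b: (a - b) ** 2`. Retrieval is approximate: the true nearest neighbor is found only if it collides with the query in at least one table, so larger `l` tends to raise accuracy while larger `k` tends to shrink the candidate set.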
Pages: 327 / +
Page count: 3
Related papers
50 in total
  • [41] ADAPTIVE BIT ALLOCATION HASHING FOR APPROXIMATE NEAREST NEIGHBOR SEARCH
    Guo, Qin-Zhen
    Zeng, Zhi
    Zhang, Shuwu
    Zhang, Yuan
    Wang, Fangyuan
    2013 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2013), 2013
  • [42] Batch nearest neighbor search for video retrieval
    Jie Shao
    Zi Huang
    Shen, Heng Tao
    Zhou, Xiaofang
    Lim, Ee-Peng
    Li, Yijun
    IEEE TRANSACTIONS ON MULTIMEDIA, 2008, 10 (03) : 409 - 420
  • [43] T-copula and Wasserstein distance-based stochastic neighbor embedding
    Huang, Yanyong
    Guo, Kejun
    Yi, Xiuwen
    Yu, Jing
    Shen, Zongxin
    Li, Tianrui
    KNOWLEDGE-BASED SYSTEMS, 2022, 243
  • [44] Distance Maximization and Defences on Deep Hashing Based Image Retrieval
    Lu, Junda
    Miao, Yukai
    Chen, Mingyang
    Huang, Bo
    Li, Bing
    Wang, Wei
    Vatsalan, Dinusha
    Kaafar, Mohamed Ali
    2023 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH, ICKG, 2023, : 176 - 183
  • [45] Subset Retrieval Nearest Neighbor Machine Translation
    Deguchi, Hiroyuki
    Watanabe, Taro
    Matsui, Yusuke
    Utiyama, Masao
    Tanaka, Hideki
    Sumita, Eiichiro
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 174 - 189
  • [46] Distance-based relevance feedback using a hybrid interactive genetic algorithm for image retrieval
    Arevalillo-Herraez, Miguel
    Ferri, Francesc J.
    Moreno-Picot, Salvador
    APPLIED SOFT COMPUTING, 2011, 11 (02) : 1782 - 1791
  • [47] Improving the accuracy of k-nearest neighbor using local mean based and distance weight
    Syaliman, K. U.
    Nababan, E. B.
    Sitompul, O. S.
    2ND INTERNATIONAL CONFERENCE ON COMPUTING AND APPLIED INFORMATICS 2017, 2018, 978
  • [48] Voice Recognition using k Nearest Neighbor and Double Distance Method
    Ranny
    2016 INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING, MANAGEMENT SCIENCE AND APPLICATIONS (ICIMSA), 2016
  • [49] Case study: Distance-based image retrieval in the MoBIoS DBMS
    Mao, R
    Iqbal, Q
    Liu, WG
    Miranker, DP
    FIFTH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY - PROCEEDINGS, 2005, : 49 - 55
  • [50] Distance metric learning based on the class center and nearest neighbor relationship
    Zhao, Yifeng
    Yang, Liming
    NEURAL NETWORKS, 2023, 164 : 631 - 644