Data-Dependent Hashing via Nonlinear Spectral Gaps

Cited by: 14
Authors
Andoni, Alexandr [1]
Naor, Assaf [2]
Nikolov, Aleksandar [3]
Razenshteyn, Ilya [4]
Waingarten, Erik [1]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Univ Toronto, Toronto, ON, Canada
[4] Microsoft Res Redmond, Redmond, WA USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Nearest neighbor search; nonlinear spectral gaps; randomized space partitions; locality-sensitive hashing; NEAREST-NEIGHBOR; APPROXIMATE; EXPANDERS;
DOI
10.1145/3188745.3188846
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
We establish a generic reduction from nonlinear spectral gaps of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near Neighbor Search problem (ANN) under various distance functions. Using this reduction, we obtain the following results. For general $d$-dimensional normed spaces and $n$-point datasets, we obtain a cell-probe ANN data structure with approximation $O(\log d / \varepsilon^2)$, space $d^{O(1)} n^{1+\varepsilon}$, and $d^{O(1)} n^{\varepsilon}$ cell probes per query, for any $\varepsilon > 0$. No non-trivial approximation was known before in this generality other than the $O(\sqrt{d})$ bound which follows from embedding a general norm into $\ell_2$. For $\ell_p$ and Schatten-$p$ norms, we improve the data structure further, to obtain approximation $O(p)$ and sublinear query time. For $\ell_p$, this improves upon the previous best approximation $2^{O(p)}$ (which required polynomial as opposed to near-linear in $n$ space). For the Schatten-$p$ norm, no non-trivial ANN data structure was known before this work. Previous approaches to the ANN problem either exploit the low dimensionality of a metric, requiring space exponential in the dimension, or circumvent the curse of dimensionality by embedding a metric into a "tractable" space, such as $\ell_1$. Our new generic reduction proceeds differently from both of these approaches using a novel partitioning method.
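For orientation, the ANN problem named in the abstract is usually stated with an approximation factor c and a distance threshold r: if some dataset point lies within distance r of the query, the data structure must return a point within distance c·r. The sketch below is only a brute-force baseline for that guarantee under an l_p norm. It is not the paper's data structure, and the names `lp_distance` and `linear_scan_ann` are hypothetical, introduced here for illustration.

```python
from typing import Optional, Sequence

Point = Sequence[float]


def lp_distance(x: Point, y: Point, p: float) -> float:
    """l_p distance between two equal-length vectors, for p >= 1."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def linear_scan_ann(
    dataset: Sequence[Point], query: Point, r: float, c: float, p: float
) -> Optional[Point]:
    """Return the first dataset point within distance c * r of the query.

    Illustrative baseline only (an assumption of this sketch, not the
    method from the paper). It satisfies the (c, r)-ANN guarantee
    trivially: if a point within distance r exists, then some point
    within c * r is found. Returns None when no point is within c * r.
    """
    for point in dataset:
        if lp_distance(point, query, p) <= c * r:
            return point
    return None


if __name__ == "__main__":
    data = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
    # Query near (3, 4); threshold r = 1, approximation c = 2, Euclidean norm.
    print(linear_scan_ann(data, (3.0, 3.0), r=1.0, c=2.0, p=2.0))
```

This baseline evaluates up to n distances per query. The bounds quoted in the abstract concern data structures that answer a query with far fewer probes, at the cost of a larger approximation factor.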
Pages: 787-800
Page count: 14