Data-Dependent Hashing via Nonlinear Spectral Gaps

Cited by: 14
Authors
Andoni, Alexandr [1]
Naor, Assaf [2]
Nikolov, Aleksandar [3]
Razenshteyn, Ilya [4]
Waingarten, Erik [1]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Univ Toronto, Toronto, ON, Canada
[4] Microsoft Res Redmond, Redmond, WA USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Nearest neighbor search; nonlinear spectral gaps; randomized space partitions; locality-sensitive hashing; NEAREST-NEIGHBOR; APPROXIMATE; EXPANDERS;
DOI
10.1145/3188745.3188846
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We establish a generic reduction from nonlinear spectral gaps of metric spaces to data-dependent Locality-Sensitive Hashing, yielding a new approach to the high-dimensional Approximate Near Neighbor Search problem (ANN) under various distance functions. Using this reduction, we obtain the following results:
  • For general $d$-dimensional normed spaces and $n$-point datasets, we obtain a cell-probe ANN data structure with approximation $O(\log d / \epsilon^2)$, space $d^{O(1)} n^{1+\epsilon}$, and $d^{O(1)} n^{\epsilon}$ cell probes per query, for any $\epsilon > 0$. No non-trivial approximation was known before in this generality other than the $O(\sqrt{d})$ bound which follows from embedding a general norm into $\ell_2$.
  • For $\ell_p$ and Schatten-$p$ norms, we improve the data structure further, to obtain approximation $O(p)$ and sublinear query time. For $\ell_p$, this improves upon the previous best approximation $2^{O(p)}$ (which required polynomial as opposed to near-linear in $n$ space). For the Schatten-$p$ norm, no non-trivial ANN data structure was known before this work.
Previous approaches to the ANN problem either exploit the low dimensionality of a metric, requiring space exponential in the dimension, or circumvent the curse of dimensionality by embedding a metric into a "tractable" space, such as $\ell_1$. Our new generic reduction proceeds differently from both of these approaches using a novel partitioning method.
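For readers unfamiliar with the problem statement, the following is a minimal sketch of the approximate near neighbor guarantee under an l_p distance: if the dataset contains a point within distance r of the query, any point within distance c*r is an acceptable answer. It is the trivial linear-scan baseline, which evaluates up to n distances per query, and not the data structure from the paper. The names `lp_distance` and `linear_scan_ann` and the parameters `r` and `c` are hypothetical, chosen only for this illustration.

```python
from typing import Optional, Sequence

Point = Sequence[float]


def lp_distance(x: Point, y: Point, p: float) -> float:
    """l_p distance between two equal-length vectors, for p >= 1."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def linear_scan_ann(
    dataset: Sequence[Point], query: Point, r: float, c: float, p: float
) -> Optional[int]:
    """Baseline (c, r)-ANN by exhaustive scan (illustrative, not the paper's method).

    Returns the index of the first point within distance c * r of the query,
    or None if no such point exists.
    """
    for i, point in enumerate(dataset):
        if lp_distance(point, query, p) <= c * r:
            return i
    return None


if __name__ == "__main__":
    data = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
    # The point (3, 4) lies within r = 1 of the query, so with c = 2
    # any point within distance 2 is a valid answer.
    print(linear_scan_ann(data, (3.5, 4.0), r=1.0, c=2.0, p=2))
```

The results in the abstract concern how large the approximation factor c must be so that a query can be answered with far fewer than n probes.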
Pages: 787-800
Page count: 14