Reporting Neighbors in High-Dimensional Euclidean Space

Cited by: 0
Authors
Aiger, Dror [1]
Kaplan, Haim [2]
Sharir, Micha [2, 3]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
[2] Tel Aviv Univ, Sch Comp Sci, IL-69978 Tel Aviv, Israel
[3] NYU, Courant Inst Math Sci, New York, NY 10012 USA
Keywords
APPROXIMATE NEAREST-NEIGHBOR; OPTIMAL HASHING ALGORITHMS
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We consider the following problem, which arises in many database and web-based applications: Given a set $P$ of $n$ points in a high-dimensional space $\mathbb{R}^d$ and a distance $r$, we want to report all pairs of points of $P$ at Euclidean distance at most $r$. We present two randomized algorithms, one based on randomly shifted grids, and the other on randomly shifted and rotated grids. The running time of both algorithms is of the form $C(d)(n + k)\log n$, where $k$ is the output size and $C(d)$ is a constant that depends on the dimension $d$. The $\log n$ factor is needed to guarantee, with high probability, that all neighbor pairs are reported, and can be dropped if it suffices to report, in expectation, an arbitrarily large fraction of the pairs. When only translations are used, $C(d)$ is of the form $(a\sqrt{d})^d$, for some (small) absolute constant $a \approx 0.484$; this bound is worst-case tight, up to an exponential factor of about $2^d$. When both rotations and translations are used, $C(d)$ can be improved to roughly $6.74^d$, getting rid of the super-exponential factor $\sqrt{d}^{\,d}$. When the input set (lies in a subset of $d$-space that) has low doubling dimension $\delta$, the performance of the first algorithm improves to $C(d, \delta)(n + k)\log n$ (or to $C(d, \delta)(n + k)$), where $C(d, \delta) = O((ed/\delta)^{\delta})$, for $\delta \le \sqrt{d}$. Otherwise, $C(d, \delta) = O(e^{\sqrt{d}}\sqrt{d}^{\,\delta})$. We also present experimental results on several large datasets, demonstrating that our algorithms run significantly faster than all the leading existing algorithms for reporting neighbors.
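The randomly shifted grid idea from the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `report_close_pairs` and the parameters `cell_factor`, `num_shifts`, and `seed` are hypothetical, and the cell side and number of shifts are arbitrary illustrative choices rather than the tuned values behind the stated bounds. Each round shifts a grid by a random vector, buckets the points by cell, and checks only pairs that share a cell, so a pair at distance at most r is reported in any round where the grid does not separate it.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def report_close_pairs(points, r, cell_factor=2.0, num_shifts=20, seed=0):
    """Return index pairs (i, j), i < j, at Euclidean distance <= r.

    Illustrative sketch only: a pair can be missed if every random
    shift happens to place its two points in different grid cells.
    """
    rng = random.Random(seed)
    d = len(points[0])
    side = cell_factor * r  # assumed cell side length, not the paper's choice
    found = set()
    for _ in range(num_shifts):
        # One random translation of the grid, drawn per coordinate.
        shift = [rng.uniform(0.0, side) for _ in range(d)]
        buckets = defaultdict(list)
        for idx, p in enumerate(points):
            cell = tuple(math.floor((p[j] + shift[j]) / side) for j in range(d))
            buckets[cell].append(idx)
        # Only points sharing a cell are compared; the exact distance
        # check guarantees that no far pair is ever reported.
        for members in buckets.values():
            for i, j in combinations(members, 2):
                if math.dist(points[i], points[j]) <= r:
                    found.add((i, j))
    return found
```

Because every candidate pair is verified with an exact distance test, the output is always a subset of the true neighbor pairs; repeating over more shifts only raises the chance that each true pair is caught, which mirrors the role the abstract assigns to the $\log n$ factor.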
Pages: 784-803
Page count: 20