Efficient Computation of k-Nearest Neighbour Graphs for Large High-Dimensional Data Sets on GPU Clusters

Cited by: 10
Authors
Dashti, Ali [1]
Komarov, Ivan [1]
D'Souza, Roshan M. [1]
Affiliations
[1] Univ Wisconsin, Complex Syst Simulat Lab, Dept Mech Engn, Milwaukee, WI 53201 USA
Source
PLOS ONE | 2013 / Vol. 8 / Issue 09
Funding
National Science Foundation (USA)
Keywords
CONSTRUCTION
DOI
10.1371/journal.pone.0074113
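Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract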
This paper presents an implementation of brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large, high-dimensional data clouds. The proposed method uses Graphics Processing Units (GPUs) and scales through multiple levels of parallelism: between nodes of a cluster, between different GPUs on a single node, and within a GPU. The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing compared with an optimized method running on a cluster of CPUs, and bring a hitherto impossible k-NNG generation for a dataset of twenty million images with a dimensionality of 15k into the realm of practical possibility.
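For readers unfamiliar with the term, brute-force exact k-NNG construction can be illustrated with a minimal single-threaded sketch. This is not the authors' GPU implementation; the function name `knn_graph`, the toy data, and the choice of Euclidean distance are assumptions made for illustration only (the abstract does not name a distance metric).

```python
import heapq
import math


def knn_graph(points, k):
    """Brute-force exact k-NN graph (hypothetical helper, not from the paper).

    Returns one list per point, holding its k nearest neighbours as
    (distance, neighbour_index) pairs sorted by increasing distance.
    """
    n = len(points)
    graph = []
    for i in range(n):
        # Compare point i against every other point: O(n) distances per row.
        # Euclidean distance is an assumption of this sketch.
        candidates = (
            (math.dist(points[i], points[j]), j)
            for j in range(n)
            if j != i
        )
        # Keep only the k smallest distances for this row.
        graph.append(heapq.nsmallest(k, candidates))
    return graph


# Toy 2-D data set; the paper targets millions of points in ~15k dimensions.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = knn_graph(points, k=2)
```

Each row of the graph depends only on the input points, not on other rows, so the outer loop can be split across workers. That independence is what makes the brute-force approach a natural fit for the multi-level parallelism the abstract describes.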
Pages: 12