A Fast Exact k-Nearest Neighbors Algorithm for High Dimensional Search Using k-Means Clustering and Triangle Inequality

Cited by: 0
Authors
Wang, Xueyi [1 ]
Affiliation
[1] NW Nazarene Univ, Dept Math & Comp Sci, Nampa, ID 83642 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds the nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds the nearest training objects starting from the cluster nearest to the query object and uses the triangle inequality to reduce the number of distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction in distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces.
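The two stages described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names `build_index` and `knn_search`, the Euclidean metric, the plain Lloyd's k-means, the cluster count, and the choice to sort each cluster by descending distance to its center are all assumptions made for illustration. The pruning step relies only on the triangle inequality: for a query q, a training point p, and the center c of p's cluster, d(q, p) >= d(q, c) - d(p, c), so p can be skipped without computing d(q, p) whenever that lower bound is already no smaller than the current k-th best distance.

```python
import heapq
import math
import random


def build_index(points, n_clusters, n_iter=10, seed=0):
    """Buildup stage (sketch): cluster the training points with Lloyd's k-means."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    # Final assignment: store each point with its distance to its own center.
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[j].append((math.dist(p, centers[j]), p))
    # Assumption: descending order lets the search stop early inside a cluster.
    for members in clusters:
        members.sort(key=lambda m: m[0], reverse=True)
    return centers, clusters


def knn_search(query, centers, clusters, k):
    """Searching stage (sketch): exact k-NN, returns (neighbors, n_distance_calcs)."""
    # Visit clusters from the nearest center to the farthest.
    center_dists = sorted((math.dist(query, c), j) for j, c in enumerate(centers))
    n_calcs = len(centers)
    heap = []  # max-heap of the k best so far, stored as (-distance, point)
    for d_qc, j in center_dists:
        for d_pc, p in clusters[j]:
            # Triangle inequality: d(query, p) >= d_qc - d_pc. Later points in
            # this cluster have smaller d_pc, so the bound only grows: stop.
            if len(heap) == k and d_qc - d_pc >= -heap[0][0]:
                break
            d = math.dist(query, p)
            n_calcs += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    return sorted((-neg_d, p) for neg_d, p in heap), n_calcs


if __name__ == "__main__":
    rng = random.Random(1)
    train = [tuple(rng.random() for _ in range(8)) for _ in range(300)]
    centers, clusters = build_index(train, n_clusters=17)
    query = tuple(rng.random() for _ in range(8))
    neighbors, n_calcs = knn_search(query, centers, clusters, k=5)
    print(f"{n_calcs} distance calculations for {len(train)} training points")
```

Because the bound holds for any center, the result stays exact even if k-means has not converged; the quality of the clustering affects only how many distance calculations are pruned. The returned count includes the query-to-center distances, so it can be compared directly against the number of training points that a brute-force scan would examine.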
Pages: 1293-1299
Page count: 7
Related papers
50 in total
  • [1] A new fast search algorithm for exact k-nearest neighbors based on optimal triangle-inequality-based check strategy
    Pan, Yiwei
    Pan, Zhibin
    Wang, Yikun
    Wang, Wei
    [J]. KNOWLEDGE-BASED SYSTEMS, 2020, 189
  • [2] K-nearest neighbors clustering algorithm
    Gauza, Dariusz
    Zukowska, Anna
    Nowak, Robert
    [J]. PHOTONICS APPLICATIONS IN ASTRONOMY, COMMUNICATIONS, INDUSTRY, AND HIGH-ENERGY PHYSICS EXPERIMENTS 2014, 2014, 9290
  • [3] Fast agglomerative clustering using information of k-nearest neighbors
    Chang, Chih-Tang
    Lai, Jim Z. C.
    Jeng, M. D.
    [J]. PATTERN RECOGNITION, 2010, 43 (12) : 3958 - 3968
  • [4] A multilevel k-nearest neighbour learning algorithm based on k-means clustering
    Ying, Xu
    [J]. 2007 International Symposium on Computer Science & Technology, Proceedings, 2007, : 250 - 253
  • [5] The Accuracy of the k-Nearest Neighbors and k-Means Algorithms in Gesture Identification
    Guzsvinecz, Tibor
    Szűcs, Judit
    Szucs, Veronika
    Demeter, Robert
    Katona, Jozsef
    Kovari, Attila
    [J]. Infocommunications Journal, 2024, : 30 - 36
  • [6] Movie Recommender System Using K-Means Clustering AND K-Nearest Neighbor
    Ahuja, Rishabh
    Solanki, Arun
    Nayyar, Anand
    [J]. 2019 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2019), 2019, : 263 - 268
  • [7] Fast exact k nearest neighbors search using an orthogonal search tree
    Liaw, Yi-Ching
    Leou, Maw-Lin
    Wu, Chien-Min
    [J]. PATTERN RECOGNITION, 2010, 43 (06) : 2351 - 2358
  • [8] Graph Clustering Using Mutual K-Nearest Neighbors
    Sardana, Divya
    Bhatnagar, Raj
    [J]. ACTIVE MEDIA TECHNOLOGY, AMT 2014, 2014, 8610 : 35 - 48
  • [9] Relative density based K-nearest neighbors clustering algorithm
    Liu, QB
    Deng, S
    Lu, CH
    Wang, B
    Zhou, YF
    [J]. 2003 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-5, PROCEEDINGS, 2003, : 133 - 137
  • [10] Fast k-nearest neighbors search using modified principal axis search tree
    Liaw, Yi-Ching
    Wu, Chien-Min
    Leou, Maw-Lin
    [J]. DIGITAL SIGNAL PROCESSING, 2010, 20 (05) : 1494 - 1501