A Large-Scale k-Nearest Neighbor Classification Algorithm Based on Neighbor Relationship Preservation

Cited by: 6
Authors
Song, Yunsheng [1]
Kong, Xiaohan [1]
Zhang, Chao [1]
Affiliations
[1] Shandong Agr Univ, Coll Informat Sci & Engn, Tai An 271018, Shandong, Peoples R China
Keywords
CONDENSATION; SELECTION;
DOI
10.1155/2022/7409171
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Because it makes no assumptions about the underlying data distribution and generalizes well, the k-nearest neighbor (kNN) classification algorithm is widely used in face recognition, text classification, emotional analysis, and other fields. However, kNN must compute the similarity between the unlabeled instance and every training instance at prediction time, which makes it difficult to apply to large-scale data. To overcome this difficulty, a growing number of acceleration algorithms based on data partition have been proposed, but they lack a theoretical analysis of how data partition affects classification performance. This paper analyzes that effect theoretically using empirical risk minimization and proposes a large-scale kNN classification algorithm based on neighbor relationship preservation. The search for the nearest neighbors is converted into a constrained optimization problem. The paper then estimates the difference in the objective function value at the optimal solution with and without data partition. According to this estimate, minimizing the similarity between instances in different subsets can largely reduce the effect of data partition. The minibatch k-means clustering algorithm is chosen to perform the partition because of its effectiveness and efficiency. Finally, the nearest neighbors of a test instance are searched repeatedly in the set formed by successively merging candidate subsets, until the neighbors no longer change; the candidate subsets are selected according to the similarity between the test instance and the cluster centers. Experimental results on public datasets show that the proposed algorithm largely preserves the same nearest neighbors as the original kNN algorithm, shows no significant difference in classification accuracy from it, and achieves better results than two state-of-the-art algorithms.
Pages: 11
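
The search procedure outlined in the abstract (partition the training set, rank the subsets by how close their centers are to the test instance, then merge subsets one at a time until the k nearest neighbors stop changing) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, Euclidean distance is assumed as the (dis)similarity measure, and a plain Lloyd k-means stands in for the minibatch k-means named in the abstract so that the example needs only the standard library.

```python
import math
import random
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length points (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans_partition(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd k-means, used here as a stand-in for minibatch k-means.

    Returns (centers, members); members[c] holds the indices of the
    training points assigned to cluster c.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    members = []
    for _ in range(n_iter):
        members = [[] for _ in range(n_clusters)]
        for i, p in enumerate(points):
            nearest = min(range(n_clusters), key=lambda c: dist(p, centers[c]))
            members[nearest].append(i)
        for c, idx in enumerate(members):
            if idx:  # keep the old center if a cluster became empty
                cols = zip(*(points[i] for i in idx))
                centers[c] = [sum(col) / len(idx) for col in cols]
    return centers, members


def knn_by_merging(query, points, centers, members, k):
    """Indices of the k nearest neighbors found by successive merging.

    Clusters are visited from the closest center to the farthest. After
    each merge the k nearest candidates are recomputed, and the search
    stops once a merge leaves that neighbor list unchanged.
    """
    order = sorted(range(len(centers)), key=lambda c: dist(query, centers[c]))
    candidates, previous = [], None
    for c in order:
        if not members[c]:
            continue  # an empty subset cannot change the neighbors
        candidates.extend(members[c])
        current = sorted(candidates, key=lambda i: dist(query, points[i]))[:k]
        if len(current) == k and current == previous:
            break
        previous = current
    return previous


def predict(query, points, labels, centers, members, k):
    """Majority vote over the neighbors returned by knn_by_merging."""
    neighbors = knn_by_merging(query, points, centers, members, k)
    return Counter(labels[i] for i in neighbors).most_common(1)[0][0]


# Toy data: two well-separated groups with labels 0 and 1.
train = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
         (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
train_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
toy_centers, toy_members = kmeans_partition(train, n_clusters=2)
print(predict((0.2, 0.1), train, train_labels, toy_centers, toy_members, k=3))
```

The stopping rule is a heuristic: it can end the search before every subset has been examined, which is where the speed-up comes from, and it is also why the abstract speaks of "largely" keeping the same neighbors rather than guaranteeing them.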