Efficient kNN Classification With Different Numbers of Nearest Neighbors

Cited by: 905

Authors:

Zhang, Shichao [1]

Li, Xuelong [2]

Zong, Ming [1]

Zhu, Xiaofeng [1]

Wang, Ruili [3]

Affiliations:

[1] Guangxi Normal Univ, Coll Comp Sci & Informat Technol, Guangxi Key Lab MIMS, Guilin 541004, Peoples R China

[2] Chinese Acad Sci, Xian Inst Opt & Precis Mech, Ctr OPT IMagery Anal & Learning, State Key Lab Transient Opt & Photon, Xian 710119, Shaanxi, Peoples R China

[3] Massey Univ, Inst Nat & Math Sci, Auckland 4442, New Zealand

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2018 / Vol. 29 / No. 05

Keywords:

Decision tree; k nearest neighbor (kNN) classification; sparse coding; IMAGE; SELECTION; EXTRACTION; REGRESSION; ALGORITHM;

DOI:

10.1109/TNNLS.2017.2673241

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The k nearest neighbor (kNN) method is a popular classification method in data mining and statistics because of its simple implementation and strong classification performance. However, it is impractical for traditional kNN methods to assign a fixed k value (even one set by experts) to all test samples. Previous solutions assign different k values to different test samples by cross validation, but they are usually time-consuming. This paper proposes a kTree method that learns different optimal k values for different test/new samples by adding a training stage to kNN classification. Specifically, in the training stage, the kTree method first learns optimal k values for all training samples with a new sparse reconstruction model, and then constructs a decision tree (namely, kTree) from the training samples and the learned optimal k values. In the test stage, the kTree quickly outputs the optimal k value for each test sample, and kNN classification is then conducted using that k value and all training samples. As a result, the proposed kTree method has a similar running cost but higher classification accuracy compared with traditional kNN methods, which assign a fixed k value to all test samples. Moreover, the proposed kTree method has a lower running cost but achieves similar classification accuracy compared with recent kNN methods, which assign different k values to different test samples. This paper further proposes an improved version of the kTree method (namely, the k*Tree method) that speeds up the test stage by additionally storing information about the training samples in the leaf nodes of the kTree, such as the training samples located in the leaf nodes, their kNNs, and the nearest neighbors of these kNNs. We call the resulting decision tree k*Tree; it allows kNN classification to be conducted using a subset of the training samples in the leaf nodes rather than all training samples, as in the recent kNN methods. This reduces the running cost of the test stage. Finally, experimental results on 20 real data sets showed that the proposed methods (i.e., kTree and k*Tree) are much more efficient than the compared methods on classification tasks.
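The two-stage idea in the abstract (learn a k value per training sample, fit a decision tree that maps features to k, then look up k for each test sample before running kNN) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the sparse reconstruction model, so the sketch swaps in a simple leave-one-out rule to pick each training sample's k, and the tree is a small hand-written CART-style tree. All function names, the candidate k values, and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(X, y, query, k, skip=None):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    order = sorted((i for i in range(len(X)) if i != skip),
                   key=lambda i: math.dist(X[i], query))
    return Counter(y[i] for i in order[:k]).most_common(1)[0][0]

def learn_k_per_sample(X, y, candidates):
    """ASSUMPTION: stand-in for the paper's sparse reconstruction step.
    Picks the smallest candidate k whose leave-one-out vote recovers the
    sample's own label, falling back to the largest candidate."""
    ks = []
    for i in range(len(X)):
        good = [k for k in candidates
                if knn_predict(X, y, X[i], k, skip=i) == y[i]]
        ks.append(good[0] if good else candidates[-1])
    return ks

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, targets, depth=0, max_depth=3):
    """Tiny CART-style tree: leaves are k values, inner nodes are
    (feature, threshold, left_subtree, right_subtree) tuples."""
    majority = Counter(targets).most_common(1)[0][0]
    if depth == max_depth or len(set(targets)) == 1:
        return majority
    best = None
    for f in range(len(X[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for t in sorted({x[f] for x in X})[:-1]:
            left = [i for i, x in enumerate(X) if x[f] <= t]
            right = [i for i, x in enumerate(X) if x[f] > t]
            score = (len(left) * gini([targets[i] for i in left])
                     + len(right) * gini([targets[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # all samples identical: no split possible
        return majority
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [targets[i] for i in left],
                       depth + 1, max_depth),
            build_tree([X[i] for i in right], [targets[i] for i in right],
                       depth + 1, max_depth))

def tree_lookup(tree, x):
    """Walk from the root to a leaf and return the k stored there."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def ktree_classify(tree, X, y, query):
    """Test stage: look up k for the query, then run kNN on all training samples."""
    return knn_predict(X, y, query, tree_lookup(tree, query))

# Toy data (hypothetical): two clusters plus one mislabeled point in the first.
X_train = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
           (1.0, 1.0), (1.1, 0.9), (0.9, 1.2)]
y_train = ["a", "a", "a", "b", "b", "b", "b"]

k_values = learn_k_per_sample(X_train, y_train, candidates=[1, 3, 5])
k_tree = build_tree(X_train, k_values)
label_near_a = ktree_classify(k_tree, X_train, y_train, (0.0, 0.1))
label_near_b = ktree_classify(k_tree, X_train, y_train, (1.05, 1.0))
```

In this toy setup the samples near the mislabeled point end up with a larger k than those in the clean cluster, which is the behavior the abstract motivates: a single fixed k suits neither region. The k*Tree variant described in the abstract would additionally store training samples and their neighbors in the leaves so that the final kNN step searches only a subset; that extension is not shown here.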


Pages: 1774-1785

Number of pages: 12

50 entries in total

[1] Fuzzy KNN Method With Adaptive Nearest Neighbors
Bian, Zekang
Vong, Chi Man
Wong, Pak Kin
Wang, Shitong
IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (06) : 5380 - 5393
[2] PL-kNN: A Parameterless Nearest Neighbors Classifier
Jodas, Danilo Samuel
Passos, Leandro Aparecido
Adeel, Ahsan
Papa, Joao Paulo
2022 29TH INTERNATIONAL CONFERENCE ON SYSTEMS, SIGNALS AND IMAGE PROCESSING (IWSSIP), 2022,
[3] Adaptive Nearest Neighbors for Classification
Jhun, Myoungshic
Choi, Inkyung
KOREAN JOURNAL OF APPLIED STATISTICS, 2009, 22 (03) : 479 - 488
[4] Probabilistic Nearest Neighbors Classification
Fava, Bruno
Marques, Paulo C. F.
Lopes, Hedibert F.
ENTROPY, 2024, 26 (01)
[5] Compressed kNN: K-Nearest Neighbors with Data Compression
Salvador-Meneses, Jaime
Ruiz-Chavez, Zoila
Garcia-Rodriguez, Jose
ENTROPY, 2019, 21 (03)
[6] Bayes-Decisive Linear KNN with Adaptive Nearest Neighbors
Zhang, Jin
Bian, Zekang
Wang, Shitong
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2024, 2024
[7] Skin Disease Classification: A Comparative Analysis of K-Nearest Neighbors (KNN) and Random Forest Algorithm
Pal, Osim Kumar
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON ELECTRONICS, COMMUNICATIONS AND INFORMATION TECHNOLOGY 2021 (ICECIT 2021), 2021,
[8] Efficient Identification of Tanimoto Nearest Neighbors
Anastasiu, David C.
Karypis, George
PROCEEDINGS OF 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS, (DSAA 2016), 2016, : 156 - 165
[9] A fast K nearest neighbors classification algorithm
Pan, JS
Qiao, YL
Sun, SH
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2004, E87A (04) : 961 - 963
[10] Classification with learning k-nearest neighbors
Laaksonen, J
Oja, E
ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 1480 - 1483
