Spectral Clustering Approach with K-Nearest Neighbor and Weighted Mahalanobis Distance for Data Mining

Cited by: 7
Authors
Yin, Lifeng [1]
Lv, Lei [1]
Wang, Dingyi [2]
Qu, Yingwei [1]
Chen, Huayue [3]
Deng, Wu [4, 5]
Affiliations
[1] Dalian Jiaotong Univ, Sch Software, Dalian 116028, Peoples R China
[2] Beijing Jiaotong Univ, Sch Elect & Informat Engn, Beijing 100044, Peoples R China
[3] China West Normal Univ, Sch Comp Sci, Nanchong 637002, Peoples R China
[4] Civil Aviat Univ China, Coll Elect Informat & Automat, Tianjin 300300, Peoples R China
[5] Southwest Jiaotong Univ, State Key Lab Tract Power, Chengdu 610031, Peoples R China
Keywords
data mining; spectral clustering; Mahalanobis distance; Laplace matrix; K-means clustering
DOI
10.3390/electronics12153284
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a spectral clustering method that uses k-means and a weighted Mahalanobis distance (referred to as MDLSC) to strengthen the correlation between data points and to improve the clustering accuracy obtained from the eigenvectors of the Laplacian matrix. First, the correlation coefficient is used as the weight of the Mahalanobis distance to compute the weighted Mahalanobis distance between every pair of data points, which gives the weighted Mahalanobis distance matrix of the data set; from this matrix, the K-nearest neighbor (KNN) algorithm is used to construct the similarity matrix. Second, the regularized Laplacian matrix is computed from the similarity matrix, then normalized and decomposed to obtain the feature space for clustering. The method takes into account both the degree of linear correlation between data and their special spatial structure, and thereby achieves accurate clustering. Finally, MDLSC is compared with several spectral clustering algorithms in multi-angle experiments on artificial and UCI data sets. The experimental results show that MDLSC has certain advantages on each clustering index and yields better clustering quality. The distribution of the eigenvectors also shows that the similarity matrix computed by MDLSC is more reasonable, and that the eigenvectors of the Laplacian matrix retain as much as possible of the distribution characteristics of the original data, which improves the accuracy of the clustering algorithm.
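The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the exact weighting formula, so the per-feature weight used here (mean absolute correlation of a feature with all features) is an assumption, as are the Gaussian kernel on the KNN edges, the symmetric normalized Laplacian, the farthest-point k-means initialization, and all function names and parameters (`k`, `sigma`, `n_iter`).

```python
import numpy as np

def weighted_mahalanobis_matrix(X):
    """Pairwise distances sqrt((x - y)^T W S^-1 W (x - y)) for the rows of X."""
    S_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    R = np.corrcoef(X, rowvar=False)
    # ASSUMPTION: weight of a feature = its mean absolute correlation with all features.
    w = np.abs(R).mean(axis=0)
    M = np.diag(w) @ S_inv @ np.diag(w)
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,kl,ijl->ij", diff, M, diff)
    return np.sqrt(np.maximum(d2, 0.0))

def knn_similarity(D, k, sigma=1.0):
    """Symmetric KNN graph; ASSUMPTION: Gaussian kernel weights on the kept edges."""
    n = D.shape[0]
    D_no_self = D + np.diag(np.full(n, np.inf))  # exclude each point from its own neighbors
    idx = np.argsort(D_no_self, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    A = np.zeros_like(D)
    A[rows, cols] = np.exp(-D[rows, cols] ** 2 / (2.0 * sigma ** 2))
    return np.maximum(A, A.T)  # keep an edge if either endpoint selected it

def spectral_embedding(A, n_clusters):
    """Row-normalized eigenvectors of L_sym = I - D^{-1/2} A D^{-1/2} (smallest eigenvalues)."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    U = vecs[:, :n_clusters]
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(U, n_clusters, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(U - c, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(U[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

def mdlsc_sketch(X, n_clusters, k):
    """Distance matrix -> KNN similarity -> Laplacian eigenvectors -> k-means."""
    D = weighted_mahalanobis_matrix(X)
    A = knn_similarity(D, k)
    return kmeans(spectral_embedding(A, n_clusters), n_clusters)

# Toy usage on three synthetic, well-separated blobs (hypothetical data).
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.2 * rng.standard_normal((15, 2)) for c in blob_centers])
labels = mdlsc_sketch(X, n_clusters=3, k=7)
```

With only two features the assumed weights are equal for both features, so the toy data exercises the pipeline shape more than the weighting itself; on higher-dimensional data the weights would differ per feature.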
Pages: 23