Spectral Clustering Approach with K-Nearest Neighbor and Weighted Mahalanobis Distance for Data Mining

Cited by: 7
Authors
Yin, Lifeng [1 ]
Lv, Lei [1 ]
Wang, Dingyi [2 ]
Qu, Yingwei [1 ]
Chen, Huayue [3 ]
Deng, Wu [4 ,5 ]
Affiliations
[1] Dalian Jiaotong Univ, Sch Software, Dalian 116028, Peoples R China
[2] Beijing Jiaotong Univ, Sch Elect & Informat Engn, Beijing 100044, Peoples R China
[3] China West Normal Univ, Sch Comp Sci, Nanchong 637002, Peoples R China
[4] Civil Aviat Univ China, Coll Elect Informat & Automat, Tianjin 300300, Peoples R China
[5] Southwest Jiaotong Univ, State Key Lab Tract Power, Chengdu 610031, Peoples R China
Keywords
data mining; spectral clustering; Mahalanobis distance; Laplace matrix; K-means clustering;
DOI
10.3390/electronics12153284
CLC Number
TP [Automation technology, computer technology]
Discipline Code
0812
Abstract
This paper proposes a spectral clustering method using k-means and a weighted Mahalanobis distance (referred to as MDLSC) to enhance the degree of correlation between data points and improve the clustering accuracy of the Laplacian matrix eigenvectors. First, we used the correlation coefficient as the weight of the Mahalanobis distance to calculate the weighted Mahalanobis distance between any two data points and constructed the weighted Mahalanobis distance matrix of the data set; then, based on this matrix, we used the K-nearest neighbor (KNN) algorithm to construct the similarity matrix. Second, the regularized Laplacian matrix was calculated from the similarity matrix, normalized, and decomposed, yielding the feature space used for clustering. This method fully considers the degree of linear correlation between data points as well as their spatial structure, and therefore achieves accurate clustering. Finally, various spectral clustering algorithms were used to conduct multi-angle comparative experiments on artificial and UCI data sets. The experimental results show that MDLSC has advantages on every clustering index and yields better clustering quality. The distribution of the eigenvectors also shows that the similarity matrix computed by MDLSC is more reasonable, and that the eigenvectors of the resulting Laplacian matrix retain the distribution characteristics of the original data as much as possible, thereby improving the accuracy of the clustering algorithm.
Pages: 23
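The abstract describes a complete pipeline: a correlation-weighted Mahalanobis distance matrix, a KNN similarity graph, a normalized Laplacian, an eigenvector feature space, and k-means. The Python sketch below illustrates that pipeline under stated assumptions; it is not the authors' implementation. In particular, the function name `mdlsc`, the element-wise weighting of the inverse covariance by the absolute feature correlation, the Gaussian-kernel bandwidth `sigma`, and the Ng-Jordan-Weiss-style row normalization are illustrative choices; the paper's exact weighting and normalization may differ.

```python
# A minimal sketch of the MDLSC-style pipeline described in the abstract.
# Assumptions (not from the paper): the correlation-coefficient weighting is
# applied element-wise to the inverse covariance matrix, a Gaussian kernel
# converts distances to similarities, and standard normalized spectral
# clustering (smallest-eigenvector feature space + k-means) follows.
import numpy as np
from scipy.linalg import eigh, pinv
from sklearn.cluster import KMeans


def mdlsc(X, n_clusters, n_neighbors=10, sigma=1.0):
    n = X.shape[0]

    # Weighted Mahalanobis metric: scale the inverse covariance by the
    # absolute Pearson correlation between features (assumed weighting).
    VI = pinv(np.cov(X, rowvar=False))
    W_feat = np.abs(np.corrcoef(X, rowvar=False))
    M = VI * W_feat  # Hadamard product of two PSD matrices stays PSD

    # Pairwise weighted Mahalanobis distances D[i, j] = sqrt((xi-xj)' M (xi-xj)).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt(np.einsum('ijk,kl,ijl->ij', diff, M, diff))

    # KNN similarity matrix: keep each point's k nearest neighbors,
    # convert distances to similarities with a Gaussian kernel, symmetrize.
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[1:n_neighbors + 1]  # skip the point itself
        S[i, idx] = np.exp(-D[i, idx] ** 2 / (2 * sigma ** 2))
    S = np.maximum(S, S.T)

    # Normalized Laplacian L = I - Deg^{-1/2} S Deg^{-1/2}.
    deg = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(n) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]

    # Feature space: eigenvectors of the k smallest eigenvalues,
    # row-normalized, then clustered with k-means.
    _, vecs = eigh(L, subset_by_index=[0, n_clusters - 1])
    U = vecs / np.maximum(np.linalg.norm(vecs, axis=1, keepdims=True), 1e-12)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(U)
```

For example, `labels = mdlsc(X, n_clusters=3, n_neighbors=10)` returns cluster labels for a data matrix `X` with samples in rows; the dense pairwise-difference tensor keeps the sketch short but limits it to modest data set sizes.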