Spectral Clustering Approach with K-Nearest Neighbor and Weighted Mahalanobis Distance for Data Mining

Cited by: 7
Authors
Yin, Lifeng [1]
Lv, Lei [1]
Wang, Dingyi [2]
Qu, Yingwei [1]
Chen, Huayue [3]
Deng, Wu [4, 5]
Affiliations
[1] Dalian Jiaotong Univ, Sch Software, Dalian 116028, Peoples R China
[2] Beijing Jiaotong Univ, Sch Elect & Informat Engn, Beijing 100044, Peoples R China
[3] China West Normal Univ, Sch Comp Sci, Nanchong 637002, Peoples R China
[4] Civil Aviat Univ China, Coll Elect Informat & Automat, Tianjin 300300, Peoples R China
[5] Southwest Jiaotong Univ, State Key Lab Tract Power, Chengdu 610031, Peoples R China
Keywords
data mining; spectral clustering; Mahalanobis distance; Laplace matrix; K-means clustering
DOI
10.3390/electronics12153284
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper proposes a spectral clustering method using k-means and weighted Mahalanobis distance (referred to as MDLSC) to enhance the degree of correlation between data points and improve the clustering accuracy of Laplacian matrix eigenvectors. First, the correlation coefficient is used as the weight of the Mahalanobis distance to calculate the weighted Mahalanobis distance between any two data points and to construct the weighted Mahalanobis distance matrix of the data set; then, based on this distance matrix, the K-nearest neighbor (KNN) algorithm is used to construct the similarity matrix. Second, the regularized Laplacian matrix is calculated from the similarity matrix, normalized, and decomposed to obtain the feature space for clustering. The method takes into account both the degree of linear correlation between data and the special spatial structure of the data, and achieves accurate clustering. Finally, various spectral clustering algorithms are compared in multi-angle experiments on artificial and UCI data sets. The experimental results show that MDLSC has advantages on each clustering index and yields better clustering quality. The distribution of the eigenvectors also shows that the similarity matrix calculated by MDLSC is more reasonable, and that the eigenvectors of the Laplacian matrix retain as much as possible of the distribution characteristics of the original data, thereby improving the accuracy of the clustering algorithm.
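The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact weighting formula, so the per-feature weight used here (mean absolute Pearson correlation of a feature with all features) is an assumption, as are the Gaussian kernel, the symmetric normalized Laplacian, the row normalization of the eigenvectors, and all function names (`weighted_mahalanobis_matrix`, `knn_similarity`, `spectral_embedding`, `kmeans`, `mdlsc_sketch`), which are hypothetical.

```python
import numpy as np


def weighted_mahalanobis_matrix(X):
    """Pairwise weighted Mahalanobis distances (assumes no constant feature)."""
    cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    corr = np.atleast_2d(np.corrcoef(X, rowvar=False))
    # ASSUMPTION: weight of feature j = mean |correlation| with all features.
    w = np.abs(corr).mean(axis=1)
    M = np.diag(w) @ cov_inv @ np.diag(w)  # positive semi-definite metric
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,kl,ijl->ij", diff, M, diff)
    return np.sqrt(np.maximum(d2, 0.0))


def knn_similarity(D, k):
    """Gaussian-kernel similarity kept only on a symmetrized KNN graph."""
    n = D.shape[0]
    sigma = np.median(D[D > 0])  # ASSUMPTION: median-distance bandwidth
    S = np.exp(-(D ** 2) / (2.0 * sigma ** 2))
    D_no_self = D.copy()
    np.fill_diagonal(D_no_self, np.inf)  # a point is not its own neighbor
    idx = np.argsort(D_no_self, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], idx] = True
    mask |= mask.T  # keep an edge if either endpoint selects the other
    W = np.where(mask, S, 0.0)
    np.fill_diagonal(W, 0.0)
    return W


def spectral_embedding(W, n_clusters):
    """Row-normalized eigenvectors of L_sym = I - D^{-1/2} W D^{-1/2}."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    U = vecs[:, :n_clusters]
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.maximum(norms, 1e-12)


def kmeans(U, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd iterations with farthest-point initialization."""
    rng = np.random.default_rng(seed)
    centers = [U[rng.integers(len(U))]]
    for _ in range(1, n_clusters):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def mdlsc_sketch(X, n_clusters, k=10):
    """Distance matrix -> KNN similarity -> Laplacian embedding -> k-means."""
    D = weighted_mahalanobis_matrix(X)
    W = knn_similarity(D, k)
    U = spectral_embedding(W, n_clusters)
    return kmeans(U, n_clusters)
```

Calling `mdlsc_sketch(X, n_clusters=3, k=10)` on an `(n_samples, n_features)` NumPy array returns one integer label per row. The dense pairwise computation needs memory on the order of `n_samples**2 * n_features`, so this sketch is only suitable for small data sets.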
Pages: 23
Related papers
50 in total
  • [1] Spectral Clustering Based on k-Nearest Neighbor Graph
    Lucinska, Malgorzata
    Wierzchon, Slawomir T.
    COMPUTER INFORMATION SYSTEMS AND INDUSTRIAL MANAGEMENT (CISIM), 2012, 7564 : 254 - 265
  • [2] NOTE ON DISTANCE-WEIGHTED K-NEAREST NEIGHBOR RULES
    BAILEY, T
    JAIN, AK
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1978, 8 (04): : 311 - 313
  • [3] Weighted K-Nearest Neighbor Revisited
    Bicego, M.
    Loog, M.
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 1642 - 1647
  • [4] Adaptive Mahalanobis Distance and k-Nearest Neighbor Rule for Fault Detection in Semiconductor Manufacturing
    Verdier, Ghislain
    Ferreira, Ariane
    IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, 2011, 24 (01) : 59 - 68
  • [5] A REEXAMINATION OF THE DISTANCE-WEIGHTED K-NEAREST NEIGHBOR CLASSIFICATION RULE
    MACLEOD, JES
    LUK, A
    TITTERINGTON, DM
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS, 1987, 17 (04): : 689 - 696
  • [6] Refining a k-nearest neighbor graph for a computationally efficient spectral clustering
    Alshammari, Mashaan
    Stavrakakis, John
    Takatsuka, Masahiro
    PATTERN RECOGNITION, 2021, 114
  • [7] Spectral Clustering with Reverse Soft K-Nearest Neighbor Density Estimation
    Kursun, Olcay
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
  • [8] Enhanced Weighted K-nearest Neighbor Positioning
    Li, Xinze
    Al-Tous, Hanan
    Hajri, Salah Eddine
    Tirkkonen, Olav
    2024 IEEE 99TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2024-SPRING, 2024,
  • [9] Graph Clustering with K-Nearest Neighbor Constraints
    Jakawat, Wararat
    Makkhongkaew, Raywat
    2019 16TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE 2019), 2019, : 309 - 313
  • [10] An optimized K-Nearest Neighbor algorithm based on Dynamic Distance approach
    Sadrabadi, Aireza Naser
    Znjirchi, Seyed Mahmood
    Abadi, Habib Zare Ahmad
    Hajimoradi, Ahmad
    2020 6TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2020,