An adaptive highly improving the accuracy of clustering algorithm based on kernel density estimation

Cited by: 3
Authors
Pu, Yue [1, 2]
Yao, Wenbin [1, 2]
Li, Xiaoyong [3 ]
Alhudhaif, Adi [4 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Comp Sci, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Beijing Key Lab Intelligent Software & Multimedia, Beijing 100876, Peoples R China
[3] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100876, Peoples R China
[4] Prince Sattam Bin Abdulaziz Univ, Coll Comp Engn & Sci Al Kharj, Dept Comp Sci, POB 151, Al Kharj 11942, Saudi Arabia
Funding
National Natural Science Foundation of China;
Keywords
Clustering algorithm accuracy; Kernel density estimation; Outliers test; Adaptive KDE-decision graph;
DOI
10.1016/j.ins.2024.120187
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The Highly Improving the Accuracy of Clustering (HIAC) algorithm is designed to enhance clustering accuracy by introducing a gravitational force between data objects that draws them closer together, and by employing a decision graph to establish a weight threshold for differentiating neighbor classes and outliers. Despite its strengths, HIAC has two shortcomings: (1) it cannot generate effective decision graphs for small-scale datasets, and (2) the probability curve within the decision graph is non-smooth, which makes threshold determination by visual inspection both difficult and imprecise. This study presents an improved adaptive algorithm based on Kernel Density Estimation (KDE-AHIAC). The approach automatically selects the bandwidth based on the density and distribution of the data, and uses the kernel density function to create a decision graph that applies to any dataset. For threshold selection, we introduce an adaptive calculation method that leverages the smoothness and continuity of the kernel density curve and replaces the observational approach. Additionally, we incorporate an outlier test model using Analysis of Similarity (ANOSIM) to avoid misclassifying valid samples as outliers. In comprehensive experiments, KDE-AHIAC offers notable improvements over HIAC: it enhances the clustering accuracy of the dataset by 66.05% compared to the original data and by 6.22% over HIAC.
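The idea of reading a threshold off a smooth kernel density curve, rather than by eye, can be illustrated with a minimal sketch. This is not the authors' KDE-AHIAC procedure: the abstract does not give the bandwidth rule or the threshold formula, so the sketch assumes SciPy's `gaussian_kde` with its default data-driven bandwidth (Scott's rule) and takes the lowest valley between the two highest density peaks as the threshold. The function name `kde_valley_threshold` and the synthetic weights are hypothetical.

```python
import numpy as np
from scipy.signal import argrelextrema
from scipy.stats import gaussian_kde


def kde_valley_threshold(weights, grid_size=512):
    """Hypothetical helper: threshold at the lowest KDE valley between the two highest peaks.

    Assumption: Scott's rule (SciPy's default) stands in for the paper's
    adaptive bandwidth selection, which the abstract does not specify.
    """
    weights = np.asarray(weights, dtype=float)
    kde = gaussian_kde(weights)  # bandwidth chosen from the data (Scott's rule)
    grid = np.linspace(weights.min(), weights.max(), grid_size)
    density = kde(grid)

    # Interior local maxima of the smooth density curve.
    peaks = argrelextrema(density, np.greater)[0]
    if len(peaks) < 2:
        return None  # unimodal curve: no valley to use as a threshold

    # Keep the two highest peaks, ordered left to right on the grid.
    lo, hi = np.sort(peaks[np.argsort(density[peaks])[-2:]])
    valley = lo + np.argmin(density[lo:hi + 1])
    return float(grid[valley])


# Synthetic edge weights (an assumption for illustration): a large group of
# small weights and a smaller group of large weights.
rng = np.random.default_rng(0)
weights = np.concatenate([rng.normal(1.0, 0.2, 300), rng.normal(4.0, 0.3, 60)])
threshold = kde_valley_threshold(weights)
print(threshold)
```

Because the KDE curve is continuous and smooth, the valley is well defined even for small samples, where a histogram-style decision graph would be jagged. Returning `None` for a unimodal curve is a design choice of this sketch, not something the abstract specifies.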
Pages: 25
Related papers
50 in total
  • [1] A new algorithm for clustering based on kernel density estimation
    Matioli, L. C.
    Santos, S. R.
    Kleina, M.
    Leite, E. A.
    [J]. JOURNAL OF APPLIED STATISTICS, 2018, 45 (02) : 347 - 366
  • [2] MulticlusterKDE: a new algorithm for clustering based on multivariate kernel density estimation
    Scaldelai, D.
    Matioli, L. C.
    Santos, S. R.
    Kleina, M.
    [J]. JOURNAL OF APPLIED STATISTICS, 2022, 49 (01) : 98 - 121
  • [3] Density peaks clustering algorithm based on kernel density estimation and minimum spanning tree
    Fan, Tanghuai
    Li, Xin
    Hou, Jiazhen
    Liu, Baohong
    Kang, Ping
    [J]. International Journal of Innovative Computing and Applications, 2022, 13 (5-6) : 336 - 350
  • [4] Detection of Moving Cows Based on Adaptive Kernel Density Estimation Algorithm
    Song, Huaibo
    Yin, Xuqiang
    Wu, Dihua
    Jiang, Bo
    He, Dongjian
    [J]. Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2019, 50 (05) : 196 - 204
  • [5] An Adaptive Moving Objects Detection Algorithm Based On Kernel Density Estimation
    Hua, Man
    Li, Yanling
    Lin, Ruichun
    [J]. SENSORS, MEASUREMENT AND INTELLIGENT MATERIALS II, PTS 1 AND 2, 2014, 475-476 : 983 - 986
  • [6] Clustering based on kernel density estimation: nearest local maximum searching algorithm
    Wang, WJ
    Tan, YX
    Jiang, JH
    Lu, JZ
    Shen, GL
    Yu, RQ
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2004, 72 (01) : 1 - 8
  • [7] Stream Clustering Based on Kernel Density Estimation
    Lodi, Stefano
    Moro, Gianluca
    Sartori, Claudio
    [J]. ECAI 2006, PROCEEDINGS, 2006, 141 : 799 - +
  • [8] A clustering algorithm based on density kernel extension
    Dai, Wei-Di
    He, Pi-Lian
    Hou, Yue-Xian
    Kang, Xiao-Dong
    [J]. ADVANCES IN MACHINE LEARNING AND CYBERNETICS, 2006, 3930 : 189 - 198
  • [9] Density-based Kernel Scale Estimation for Kernel Clustering
    Sellah, Sofiane
    Nasraoui, Olfa
    [J]. 2013 FOURTH INTERNATIONAL CONFERENCE ON INFORMATION, INTELLIGENCE, SYSTEMS AND APPLICATIONS (IISA 2013), 2013, : 248 - 251
  • [10] A new interpoint distance-based clustering algorithm using kernel density estimation
    Modak, Soumita
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2023,