Estimating the Optimal Number of Clusters Via Internal Validity Index

Cited by: 8
Authors
Zhou, Shibing [1 ,2 ]
Liu, Fei [3 ]
Song, Wei [1 ,2 ]
Affiliations
[1] Jiangnan Univ, Sch Artificial Intelligence & Comp Sci, Wuxi 214122, Jiangsu, Peoples R China
[2] Jiangnan Univ, Jiangsu Prov Engn Lab Pattern Recognit & Computat, Wuxi 214122, Jiangsu, Peoples R China
[3] Jiangnan Univ, Inst Automat, Wuxi 214122, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Clustering validity index; Number of clusters; Affinity propagation; Hierarchical clustering; STATISTICAL COMPARISONS; CLASSIFIERS; VALIDATION;
DOI
10.1007/s11063-021-10427-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Estimating the optimal number of clusters (NC) is pivotal in cluster analysis. From the viewpoint of sample geometry, a novel internal clustering validity index, which is termed the between-within cluster (BWC) index, is designed in this paper. Moreover, a method is proposed to estimate the optimal NC. The BWC index improves the well-known Silhouette index. BWC validates the clustering results from a certain clustering algorithm (e.g., affinity propagation or hierarchical) and estimates the optimal NC for many kinds of data sets, including synthetic data sets, benchmark data sets, UCI data sets, gene expression data sets, and images. Theoretical analysis and experimental studies demonstrate the effectiveness and high efficiency of the new index and method.
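The abstract describes estimating NC by scoring clustering results with an internal validity index. This record does not give the BWC definition, so the minimal sketch below uses the classical Silhouette index (which the abstract says BWC improves on) as a stand-in, together with a plain average-linkage hierarchical clustering. The function names `agglomerative`, `silhouette`, and `estimate_nc` and the toy data are hypothetical and are not taken from the paper.

```python
import math

def agglomerative(points, k):
    """Average-linkage hierarchical clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def silhouette(points, labels):
    """Mean Silhouette width; singleton clusters contribute 0 by convention."""
    groups = {}
    for i, c in enumerate(labels):
        groups.setdefault(c, []).append(i)
    total = 0.0
    for i, c in enumerate(labels):
        own = groups[c]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(points[i], points[j]) for j in g) / len(g)
                for other, g in groups.items() if other != c)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def estimate_nc(points, k_max):
    """Return the k in [2, k_max] with the highest index value, plus all scores."""
    scores = {k: silhouette(points, agglomerative(points, k))
              for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores

# Toy data (assumed for illustration): three well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
best_k, scores = estimate_nc(points, 5)
print(best_k, scores)
```

Swapping in a different index only requires replacing `silhouette`; the search over candidate values of k stays the same.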
Pages: 1013-1034
Page count: 22
Related papers
50 in total
  • [1] Estimating the Optimal Number of Clusters Via Internal Validity Index
    Shibing Zhou
    Fei Liu
    Wei Song
    [J]. Neural Processing Letters, 2021, 53 : 1013 - 1034
  • [2] A novel validity index for determination of the optimal number of clusters
    Kim, DJ
    Park, YW
    Park, DJ
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2001, E84D (02) : 281 - 285
  • [3] On cluster validity index for estimation of the optimal number of fuzzy clusters
    Kim, DW
    Lee, KH
    Lee, DH
    [J]. PATTERN RECOGNITION, 2004, 37 (10) : 2009 - 2025
  • [4] An internal validity index for arbitrarily shaped clusters
    Favati, Paola
    Menchi, Ornella
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 235
  • [6] Estimating the Number of Clusters via the GUD Statistic
    Kou, Jiyao
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2014, 23 (02) : 403 - 417
  • [7] Estimating the Optimal Number of Clusters from Subsets of Ensembles
    Odebode, Afees Adegoke
    Tucker, Allan
    Arzoky, Mahir
    Swift, Stephen
    [J]. PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA), 2022, : 383 - 391
  • [8] Enhanced Cluster Validity Index for the Evaluation of Optimal Number of Clusters for Fuzzy C-Means Algorithm
    Bharill, Neha
    Tiwari, Aruna
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2014, : 1526 - 1533
  • [9] An Improved Clustering Validity Index for Determining the Number of Malware Clusters
    Wang, Youyu
    Ye, Yanfang
    Chen, Haishan
    Jiang, Qingshan
    [J]. PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY, AND IDENTIFICATION IN COMMUNICATION, 2009, : 544 - +
  • [10] Estimating the number of clusters via a corrected clustering instability
    Haslbeck, Jonas M. B.
    Wulff, Dirk U.
    [J]. COMPUTATIONAL STATISTICS, 2020, 35 (04) : 1879 - 1894