A survey of cluster validity indices for automatic data clustering using differential evolution

Cited by: 10
Authors
Jose-Garcia, Adan [1]
Gomez-Flores, Wilfrido [2]
Affiliations
[1] Univ Lille, CNRS, Cent Lille, UMR 9189,CRIStAL, F-59000 Lille, France
[2] Ctr Invest & Estudios Avanzados IPN, Unidad Tamaulipas, Cd Victoria 87130, Tamaulipas, Mexico
Keywords
Automatic clustering; Cluster validity index; Differential evolution; Particle swarm; Optimization; Validation; Algorithms; Number
DOI
10.1145/3449639.3459341
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In cluster analysis, the automatic clustering problem is that of determining both the appropriate number of clusters and the corresponding natural partitioning. It can be addressed as an optimization problem in which a cluster validity index (CVI) serves as the fitness function that evaluates the quality of candidate solutions. Many CVIs have been proposed in the literature, each aiming to identify adequate cluster solutions in terms of intracluster cohesion and intercluster separation. It is therefore important to identify both the scenarios in which these CVIs perform well and their limitations. This paper evaluates the effectiveness of 22 different CVIs used as fitness functions in ACDE, an evolutionary clustering algorithm based on differential evolution. Several synthetic datasets are considered: linearly separable data with both well-separated and overlapping clusters, and non-linearly separable data with arbitrarily shaped clusters. Real-life datasets are also considered. The experimental results indicate that the Silhouette index consistently reached acceptable performance on linearly separable data. Furthermore, the Calinski-Harabasz, Davies-Bouldin, and generalized Dunn indices obtained adequate clustering performance on both synthetic and real-life datasets. Notably, all the evaluated CVIs performed poorly on the non-linearly separable data because of their assumptions about data distributions.
Pages: 314-322
Page count: 9
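The abstract describes using a CVI as the fitness function that scores candidate partitions. The following is a minimal sketch of that idea only, not the ACDE algorithm from the paper: it assumes a candidate solution is encoded as a list of centroids, decodes it by nearest-centroid assignment, and scores it with the Silhouette index. The function names (`assign`, `silhouette`, `fitness`), the encoding, the toy data, and the penalty value for degenerate partitions are all illustrative assumptions.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def silhouette(points, labels):
    """Mean silhouette width in [-1, 1]; higher means a better partition."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return -1.0  # assumption: penalize partitions with a single cluster
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def fitness(points, centroids):
    """Hypothetical fitness: decode a centroid-based candidate, score it with a CVI."""
    return silhouette(points, assign(points, centroids))


# Toy data (assumed): two well-separated square-shaped groups.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8), (8, 9), (9, 8), (9, 9)]
candidate_k2 = [(0.5, 0.5), (8.5, 8.5)]            # two clusters
candidate_k3 = [(0.5, 0.5), (8.0, 8.5), (9.0, 8.5)]  # splits the second group

best = max([candidate_k2, candidate_k3], key=lambda c: fitness(points, c))
print(len(best), "clusters selected")
```

In an evolutionary setting, a function like `fitness` would be called on every individual in the population, so the number of clusters is selected implicitly by whichever candidate the CVI scores highest. Because the Silhouette index relies on distances to cluster members, it favors compact, well-separated groups, which is consistent with the abstract's remark that such indices struggle on arbitrarily shaped clusters.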