A decision criterion for the optimal number of clusters in hierarchical clustering

Cited by: 0
Authors
Jung, J
Park, H
Du, DZ
Drake, BL
Affiliations
[1] Qwest Commun, Minneapolis, MN 55413 USA
[2] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[3] Korea Inst Adv Study, Seoul 130012, South Korea
[4] CDT Inc, Minneapolis, MN 55454 USA
DOI
Not available
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
Clustering is widely used to partition data into groups so that the degree of association is high among members of the same group and low among members of different groups. Although many effective and efficient clustering algorithms have been developed and deployed, most still lack an automatic or online way to decide the optimal number of clusters. In this paper, we define clustering gain as a measure of clustering optimality, based on the squared error sum as a clustering algorithm proceeds. When the measure is applied to a hierarchical clustering algorithm, an optimal number of clusters can be found. In experiments, the measure performs well, producing intuitively reasonable clustering configurations in Euclidean space. Furthermore, the measure can also be used to estimate the desired number of clusters for partitional clustering methods. The clustering gain measure therefore provides a promising technique for achieving higher quality across a wide range of clustering methods.
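The abstract does not state the formula for clustering gain, so the following is only a rough sketch of the general idea: score every level of an agglomerative merge sequence and keep the level with the highest score. It assumes a gain of the form sum over clusters of (n_j - 1) * ||c_j - c||^2, where c_j is a cluster centroid, n_j its size, and c the global centroid. That form, the centroid-distance merge rule, and all function names are illustrative assumptions, not taken from the paper.

```python
def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def clustering_gain(clusters, global_centroid):
    # ASSUMED form of the gain: sum_j (n_j - 1) * ||c_j - c||^2.
    # It is zero for all-singleton clusters and for one single cluster,
    # so any maximum lies somewhere in between.
    return sum(
        (len(c) - 1) * sq_dist(centroid(c), global_centroid) for c in clusters
    )


def best_partition(points):
    """Merge bottom-up and return (gain, clusters) at the highest-gain level."""
    clusters = [[p] for p in points]
    c0 = centroid(points)
    best = (clustering_gain(clusters, c0), [list(c) for c in clusters])
    while len(clusters) > 1:
        # Merge rule (an assumption): join the two clusters with closest centroids.
        pairs = (
            (a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        i, j = min(
            pairs,
            key=lambda ab: sq_dist(
                centroid(clusters[ab[0]]), centroid(clusters[ab[1]])
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        gain = clustering_gain(clusters, c0)
        if gain > best[0]:
            best = (gain, [list(c) for c in clusters])
    return best


# Two well-separated groups of three points each (made-up toy data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
gain, clusters = best_partition(points)
print(len(clusters), clusters)
```

On this toy data the highest-scoring level is the one that separates the two groups. The pairwise search is quadratic per merge and is meant only for illustration on small inputs.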
Pages: 91-111
Page count: 21
Related papers
  • [1] A Decision Criterion for the Optimal Number of Clusters in Hierarchical Clustering
    Yunjae Jung
    Haesun Park
    Ding-Zhu Du
    Barry L. Drake
    [J]. Journal of Global Optimization, 2003, 25 (1) : 91 - 111
  • [2] Indexes to Find the Optimal Number of Clusters in a Hierarchical Clustering
    David Martin-Fernandez, Jose
    Maria Luna-Romera, Jose
    Pontes, Beatriz
    Riquelme-Santos, Jose C.
    [J]. 14TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING MODELS IN INDUSTRIAL AND ENVIRONMENTAL APPLICATIONS (SOCO 2019), 2020, 950 : 3 - 13
  • [3] Method for Determining the Optimal Number of Clusters Based on Agglomerative Hierarchical Clustering
    Zhou, Shibing
    Xu, Zhenyuan
    Liu, Fei
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (12) : 3007 - 3017
  • [4] On the optimal number of clusters in histogram clustering
    Buhmann, JM
    Held, M
    [J]. CLASSIFICATION, AUTOMATION, AND NEW MEDIA, 2002 : 37 - 45
  • [5] Automatic identification of the number of clusters in hierarchical clustering
    Karna, Ashutosh
    Gibert, Karina
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (01) : 119 - 134
  • [6] Automatic identification of the number of clusters in hierarchical clustering
    Ashutosh Karna
    Karina Gibert
    [J]. Neural Computing and Applications, 2022, 34 : 119 - 134
  • [7] Hierarchical clustering algorithms with automatic estimation of the number of clusters
    Abe, Ryosuke
    Miyamoto, Sadaaki
    Endo, Yasunori
    Hamasuna, Yukihiro
    [J]. 2017 JOINT 17TH WORLD CONGRESS OF INTERNATIONAL FUZZY SYSTEMS ASSOCIATION AND 9TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (IFSA-SCIS), 2017
  • [8] The upper bound of the optimal number of clusters in fuzzy clustering
    于剑
    程乾生
    [J]. Science China(Information Sciences), 2001, (02) : 119 - 125
  • [9] Clustering of fMRI data: the elusive optimal number of clusters
    Seghier, Mohamed L.
    [J]. PEERJ, 2018, 6
  • [10] The upper bound of the optimal number of clusters in fuzzy clustering
    Jian Yu
    Qiansheng Cheng
    [J]. Science in China Series : Information Sciences, 2001, 44 (2) : 119 - 125