A NEW APPROACH FOR DETERMINING NUMBER OF CLUSTERS

Authors
Erisoglu, Murat [1 ]
Erisoglu, Ulku [1 ]
Servi, Tayfun [2 ]
Sakallioglu, Sadullah [1 ]
Affiliations
[1] Cukurova Univ, Fac Sci & Letters, Dept Stat, TR-01300 Adana, Turkey
[2] Adiyaman Univ, TR-02040 Adiyaman, Turkey
Source
PAKISTAN JOURNAL OF STATISTICS | 2012 / Vol. 28 / Issue 01
Keywords
Number of clusters; dimension reduction; sequential minimum difference; silhouette index; gap statistic; upper tail rule; DATA SET;
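DOI
Not available
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code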
020208 ; 070103 ; 0714 ;
Abstract
In cluster analysis, identifying the number of clusters in a dataset is one of the most important problems. Although many methods have been proposed for this purpose, there is no generally accepted procedure. Many previously proposed approaches and algorithms either require initial parameter values or must be used with predefined clustering techniques that need complicated calculations. In this paper, a new method based on representative values is developed for choosing the number of clusters. The proposed method is simple, computationally efficient, and straightforward to apply. We call it the sequential minimum difference method. We show its effectiveness for choosing the number of clusters on some well-known real datasets in cluster analysis, for both non-nested and nested cluster structures.
Pages: 141 - 158
Number of pages: 18
Related papers
50 in total
  • [1] Trail-and-error approach for determining the number of clusters
    Sun, Haojun
    Sun, Mei
    [J]. ADVANCES IN MACHINE LEARNING AND CYBERNETICS, 2006, 3930 : 229 - 238
  • [2] A new evolutionary algorithm for determining the optimal number of clusters
    Lu, Wei
    Traore, Issa
    [J]. INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MODELLING, CONTROL & AUTOMATION JOINTLY WITH INTERNATIONAL CONFERENCE ON INTELLIGENT AGENTS, WEB TECHNOLOGIES & INTERNET COMMERCE, VOL 1, PROCEEDINGS, 2006, : 648 - +
  • [3] A new validation index for determining the number of clusters in a data set
    Sun, HJ
    Wang, SG
    Jiang, QS
    [J]. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 1852 - 1857
  • [4] Determining the optimal number of clusters using a new evolutionary algorithm
    Lu, W
    Traore, I
    [J]. ICTAI 2005: 17TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2005, : 712 - 713
  • [5] An Approach for Determining the Number of Clusters in a Model-Based Cluster Analysis
    Akogul, Serkan
    Erisoglu, Murat
    [J]. ENTROPY, 2017, 19 (09):
  • [6] Determining the number of clusters in cluster analysis
    My-Young Cheong
    Hakbae Lee
    [J]. Journal of the Korean Statistical Society, 2008, 37 : 135 - 143
  • [7] Deep Embedding for Determining the Number of Clusters
    Wang, Yiqi
    Shi, Zhan
    Guo, Xifeng
    Liu, Xinwang
    Zhu, En
    Yin, Jianping
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 8173 - 8174
  • [8] Determining the number of clusters by sampling with replacement
    Tonidandel, S
    Overall, JE
    [J]. PSYCHOLOGICAL METHODS, 2004, 9 (02) : 238 - 249
  • [9] Fuzzy Clustering: Determining the Number of Clusters
    Rezankova, Hana
    Husek, Dusan
    [J]. 2012 FOURTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL ASPECTS OF SOCIAL NETWORKS (CASON), 2012, : 277 - 282
  • [10] Determining the number of clusters in cluster analysis
    Cheong, My-Young
    Lee, Hakbae
    [J]. JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2008, 37 (02) : 135 - 143