Agglomerative fuzzy K-Means clustering algorithm with selection of number of clusters

Cited by: 183
Authors
Li, Mark Junjie [1]
Ng, Michael K. [1, 2]
Cheung, Yiu-ming [3]
Huang, Joshua Zhexue [4]
Affiliations
[1] Hong Kong Baptist Univ, Dept Math, Kowloon Tong, Hong Kong, Peoples R China
[2] Hong Kong Baptist Univ, Ctr Math Imaging & Vis, Kowloon Tong, Hong Kong, Peoples R China
[3] Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China
[4] Univ Hong Kong, E Business Technol Inst, Hong Kong, Hong Kong, Peoples R China
Keywords
fuzzy K-Means clustering; agglomerative; number of clusters; cluster validation;
DOI
10.1109/TKDE.2008.88
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present an agglomerative fuzzy K-Means clustering algorithm for numerical data. It extends the standard fuzzy K-Means algorithm by adding a penalty term to the objective function, which makes the clustering process insensitive to the initial cluster centers. The new algorithm can produce more consistent clustering results from different sets of initial cluster centers. Combined with cluster validation techniques, the new algorithm can determine the number of clusters in a data set, which is a well-known problem in K-Means clustering. Experimental results on synthetic data sets (2 to 5 dimensions, 500 to 5,000 objects, and 3 to 7 clusters), the BIRCH two-dimensional data set of 20,000 objects and 100 clusters, and the WINE data set of 178 objects, 17 dimensions, and 3 clusters from UCI have demonstrated the effectiveness of the new algorithm in producing consistent clustering results and determining the correct number of clusters in different data sets, some with overlapping inherent clusters.
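The abstract does not give the form of the penalty term. As an illustration only, the sketch below assumes an entropy-type penalty, lam * sum(u * log(u)), added to the usual sum of membership-weighted squared distances. Under that assumption the membership update becomes a softmax of the negative distances scaled by lam. The function name `entropy_fuzzy_kmeans` and the parameters `lam` and `init_centers` are hypothetical, and the agglomerative merging of clusters and the cluster validation step are not implemented here.

```python
import numpy as np

def entropy_fuzzy_kmeans(X, k, lam=1.0, n_iter=100, init_centers=None, seed=0):
    """Fuzzy K-Means with an ASSUMED entropy penalty lam * sum(u * log u).

    This is a sketch, not the paper's algorithm: the penalty form is an
    assumption, and no cluster merging or validation is performed.
    """
    X = np.asarray(X, dtype=float)
    if init_centers is None:
        # Pick k distinct data points as starting centers.
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
    else:
        centers = np.asarray(init_centers, dtype=float)

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center: (n, k).
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Membership update: softmax of -d / lam, shifted per row for stability.
        u = np.exp(-(d - d.min(axis=1, keepdims=True)) / lam)
        u /= u.sum(axis=1, keepdims=True)
        # Center update: membership-weighted mean of the data.
        weights = np.maximum(u.sum(axis=0), 1e-12)  # guard against empty clusters
        new_centers = (u.T @ X) / weights[:, None]
        if np.allclose(new_centers, centers, atol=1e-8):
            break
        centers = new_centers
    return u, centers

# Usage on a tiny two-group data set with explicit starting centers.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
u_demo, c_demo = entropy_fuzzy_kmeans(
    X_demo, k=2, lam=1.0, init_centers=np.array([[0.0, 0.0], [10.0, 10.0]])
)
```

In this sketch, larger values of `lam` spread each object's membership more evenly across clusters, while small values approach hard K-Means assignments.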
Pages: 1519 - 1534
Page count: 16
Related papers
50 in total
  • [1] NSS-AKmeans: An Agglomerative Fuzzy K-Means Clustering Method with Automatic Selection of Cluster Number
    Zhang, Yanfeng
    Xu, Xiaofei
    Ye, Yunming
    [J]. 2ND IEEE INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL (ICACC 2010), VOL. 2, 2010, : 32 - 38
  • [2] An improved Agglomerative levels K-means clustering algorithm
    Yu Jiankun
    Guo Jun
    [J]. 2014 INTERNATIONAL CONFERENCE ON MANAGEMENT OF E-COMMERCE AND E-GOVERNMENT (ICMECG), 2014, : 221 - 224
  • [3] Selection of Optimal Number of Clusters and Centroids for K-means and Fuzzy C-means Clustering: A Review
    Pugazhenthi, A.
    Kumar, Lakshmi Sutha
    [J]. PROCEEDINGS OF THE 2020 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND SECURITY (ICCCS-2020), 2020,
  • [4] Variable Weighting in Fuzzy k-Means Clustering to Determine the Number of Clusters
    Khan, Imran
    Luo, Zongwei
    Huang, Joshua Zhexue
    Shahzad, Waseem
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (09) : 1838 - 1853
  • [5] Setting the number of clusters in K-means clustering
    Huh, MH
    [J]. RECENT ADVANCES IN STATISTICAL RESEARCH AND DATA ANALYSIS, 2002, : 115 - 124
  • [6] Choosing the Number of Clusters in K-Means Clustering
    Steinley, Douglas
    Brusco, Michael J.
    [J]. PSYCHOLOGICAL METHODS, 2011, 16 (03) : 285 - 297
  • [7] Seed selection algorithm through K-means on optimal number of clusters
    Kuntal Chowdhury
    Debasis Chaudhuri
    Arup Kumar Pal
    Ashok Samal
    [J]. Multimedia Tools and Applications, 2019, 78 : 18617 - 18651
  • [8] Seed selection algorithm through K-means on optimal number of clusters
    Chowdhury, Kuntal
    Chaudhuri, Debasis
    Pal, Arup Kumar
    Samal, Ashok
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (13) : 18617 - 18651
  • [9] A Fuzzy Clustering Algorithm Based on K-means
    Yan, Zhen
    Pi, Dechang
    [J]. ECBI: 2009 INTERNATIONAL CONFERENCE ON ELECTRONIC COMMERCE AND BUSINESS INTELLIGENCE, PROCEEDINGS, 2009, : 523 - 528
  • [10] Modified fuzzy gap statistic for estimating preferable number of clusters in fuzzy k-means clustering
    Arima, Chinatsu
    Hakamada, Kazumi
    Okamoto, Masahiro
    Hanai, Taizo
    [J]. JOURNAL OF BIOSCIENCE AND BIOENGINEERING, 2008, 105 (03) : 273 - 281