Comparison of the performance of center-based clustering algorithms

Author
Zhang, B [1]
Affiliation
[1] Hewlett Packard Res Labs, Palo Alto, CA 94304 USA
Keywords
clustering; K-means; K-Harmonic Means; expectation-maximization; data mining
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Center-based clustering algorithms such as K-means and EM are among the most popular classes of clustering algorithms in use today. The author developed another variation in this family, K-Harmonic Means (KHM). KHM has been shown, on a small number of "benchmark" datasets, to be more robust than K-means and EM. In this paper, we compare their performance statistically. We run K-means (KM), K-Harmonic Means, and EM on each of 3600 pairs of (dataset, initialization) to compare the statistical average and variation of the performance of these algorithms. The results are that, for low-dimensional datasets, KHM performs consistently better than KM, and KM performs consistently better than EM, over a large variation of clustered-ness of the datasets and a large variation of initializations. Some of the reasons that contributed to this difference are explained.
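The evaluation protocol described in the abstract (run an algorithm on many (dataset, initialization) pairs, then compare the average and variation of the resulting performance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic dataset generator, the parameter values, and the helper names are hypothetical, and only K-means (Lloyd's iteration) is actually run. The `khm_cost` function uses the harmonic-mean form of the KHM performance function as it is commonly stated in the KHM literature; the abstract itself does not give it.

```python
import random
import statistics


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_cost(points, centers):
    """K-means performance: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def khm_cost(points, centers, p=2, eps=1e-12):
    """KHM performance (assumed form): for each point, the harmonic mean of
    the p-th powers of its distances to all centers, summed over points."""
    k = len(centers)
    total = 0.0
    for x in points:
        # eps guards against division by zero when a point sits on a center.
        inv_sum = sum(1.0 / max(sq_dist(x, c) ** (p / 2), eps) for c in centers)
        total += k / inv_sum
    return total


def lloyd_kmeans(points, centers, iters=50):
    """Plain Lloyd iteration; an empty cluster keeps its previous center."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for pt in points:
            j = min(range(len(centers)), key=lambda i: sq_dist(pt, centers[i]))
            groups[j].append(pt)
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def make_dataset(rng, k=3, n_per_cluster=30, spread=0.5):
    """Hypothetical 2-D dataset: k Gaussian blobs with random centers."""
    points = []
    for _ in range(k):
        cx, cy = rng.uniform(0, 10), rng.uniform(0, 10)
        points += [(rng.gauss(cx, spread), rng.gauss(cy, spread))
                   for _ in range(n_per_cluster)]
    return points


def run_experiment(n_datasets=5, n_inits=4, k=3, seed=0):
    """Run K-means on every (dataset, initialization) pair and return the
    mean, the sample standard deviation, and the number of runs."""
    rng = random.Random(seed)
    costs = []
    for _ in range(n_datasets):
        data = make_dataset(rng, k=k)
        for _ in range(n_inits):
            init = rng.sample(data, k)  # initialize centers from data points
            final = lloyd_kmeans(data, init)
            costs.append(kmeans_cost(data, final))
    return statistics.mean(costs), statistics.stdev(costs), len(costs)


if __name__ == "__main__":
    mean_cost, std_cost, n_runs = run_experiment()
    print(f"{n_runs} runs: mean cost {mean_cost:.3f}, std {std_cost:.3f}")
```

To compare several algorithms in the same way, each one would be run from the same initial centers on the same datasets and scored with a common performance measure. The sketch leaves out KHM and EM optimizers, so it illustrates the protocol and does not reproduce the paper's comparison.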
Pages: 63-74
Number of pages: 12