K-means properties on six clustering benchmark datasets

Cited by: 0
Authors
Pasi Fränti
Sami Sieranoja
Affiliations
[1] University of Eastern Finland, Machine Learning Group, School of Computing
Source
Applied Intelligence | 2018 / Volume 48
Keywords
Clustering algorithms; Clustering quality; k-means; Benchmark
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
This paper has two contributions. First, we introduce a clustering basic benchmark. Second, we study the performance of k-means using this benchmark. Specifically, we measure how the performance depends on four factors: (1) overlap of clusters, (2) number of clusters, (3) dimensionality, and (4) unbalance of cluster sizes. The results show that overlap is critical, and that k-means starts to work effectively when the overlap reaches 4% level.
Pages: 4743-4759
Page count: 16
Full citation: Fränti, Pasi; Sieranoja, Sami. K-means properties on six clustering benchmark datasets [J]. Applied Intelligence, 2018, 48 (12): 4743-4759.