Estimation of the Number of Clusters Using Multiple Clustering Validity Indices

Cited by: 0
Authors
Kryszczuk, Krzysztof [1]
Hurley, Paul [1]
Affiliations
[1] IBM Zurich Res Lab, Zurich, Switzerland
Source
Keywords
clustering; clustering validity indices; multiple classifier; VALIDATION;
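DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;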
Abstract
One of the challenges in unsupervised machine learning is finding the number of clusters in a dataset. Clustering Validity Indices (CVIs) are popular tools used to address this problem. A large number of CVIs have been proposed, and reports that compare different CVIs suggest that no single CVI consistently outperforms the others. Following suggestions found in prior art, in this paper we formalize the concept of using multiple CVIs for cluster number estimation in the framework of multi-classifier fusion. Using a large number of datasets, we show that decision-level fusion of multiple CVIs can lead to significant gains in accuracy in estimating the number of clusters, in particular for high-dimensional datasets with a large number of clusters.
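To make the idea of decision-level fusion concrete, here is a minimal sketch in which each CVI first picks its own preferred number of clusters and the picks are then combined by plurality vote. The abstract does not say which fusion rules or which indices the authors use, so the voting rule, the tie-break, the three index names, and all score values below are assumptions chosen purely for illustration.

```python
from collections import Counter

# Hypothetical scores: index name -> {candidate k: score}.
# Illustrative numbers only, not results from the paper.
scores = {
    "silhouette": {2: 0.41, 3: 0.58, 4: 0.52, 5: 0.40},
    "calinski_harabasz": {2: 110.0, 3: 180.5, 4: 176.2, 5: 150.3},
    "davies_bouldin": {2: 0.95, 3: 0.71, 4: 0.62, 5: 0.88},
}

# Whether a larger score means a better partition for each index
# (Davies-Bouldin is minimized, the other two are maximized).
higher_is_better = {
    "silhouette": True,
    "calinski_harabasz": True,
    "davies_bouldin": False,
}


def cvi_decision(table, maximize):
    """Return the candidate k that a single index prefers."""
    pick = max if maximize else min
    return pick(table, key=table.get)


def fuse_by_plurality(scores, higher_is_better):
    """Combine per-index decisions by plurality vote.

    Ties are broken toward the smallest k, which is an arbitrary
    assumption of this sketch.
    """
    decisions = {
        name: cvi_decision(table, higher_is_better[name])
        for name, table in scores.items()
    }
    votes = Counter(decisions.values())
    top = max(votes.values())
    fused_k = min(k for k, count in votes.items() if count == top)
    return fused_k, decisions


fused_k, decisions = fuse_by_plurality(scores, higher_is_better)
print(fused_k, decisions)
```

With these made-up scores, two indices vote for k = 3 and one votes for k = 4, so the fused estimate is 3. Fusing at the decision level sidesteps the fact that the raw index values live on incomparable scales, since only each index's chosen k enters the vote.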
Pages: 114-123
Page count: 10