Indexes to Find the Optimal Number of Clusters in a Hierarchical Clustering

Cited by: 1
Authors
Martin-Fernandez, Jose David [1]
Luna-Romera, Jose Maria [1]
Pontes, Beatriz [1]
Riquelme-Santos, Jose C. [1]
Affiliations
[1] Univ Seville, E-41012 Seville, Spain
Keywords
Machine learning; Hierarchical clustering; Internal validation indexes; Big data; Algorithms
DOI
10.1007/978-3-030-20055-8_1
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering analysis is one of the most commonly used techniques for uncovering patterns in data mining. Most clustering methods require the number of clusters to be established beforehand. However, given the size of the data sets currently in use, predicting that value is computationally expensive in most cases. In this article, we present a clustering technique that avoids this requirement by using hierarchical clustering. There are many examples of this procedure in the literature, most of them focusing on the divisive (descending) subtype, whereas this article covers the agglomerative (ascending) subtype. Although it is more expensive in computation and time, it yields very valuable information about the membership of elements in clusters and about how those clusters are grouped, that is, their dendrogram. Finally, several data sets of varying dimensionality have been used. For each of them, we compute internal validation indexes to test the algorithm developed, studying which index gives the best results for obtaining the best possible clustering.
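The general idea in the abstract, building an agglomerative hierarchy and then scoring each level with an internal validation index, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not name the linkage criterion or the indexes used, so average linkage and the mean silhouette width are assumptions chosen for illustration, and the function names (`agglomerative_levels`, `silhouette`, `best_k`) and the toy data are hypothetical.

```python
import math

def agglomerative_levels(points):
    """Average-linkage agglomerative clustering (linkage choice is an assumption).
    Returns {k: clusters} for every k from n down to 1, where each cluster
    is a list of point indices."""
    clusters = [[i] for i in range(len(points))]
    levels = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest average pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(math.dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        levels[len(clusters)] = [c[:] for c in clusters]
    return levels

def silhouette(points, clusters):
    """Mean silhouette width; points in singleton clusters score 0.
    Requires at least two clusters."""
    total = 0.0
    for ci, cluster in enumerate(clusters):
        if len(cluster) == 1:
            continue
        for p in cluster:
            # a: mean distance to the other members of the same cluster.
            a = sum(math.dist(points[p], points[q])
                    for q in cluster if q != p) / (len(cluster) - 1)
            # b: smallest mean distance to any other cluster.
            b = min(sum(math.dist(points[p], points[q]) for q in other) / len(other)
                    for oi, other in enumerate(clusters) if oi != ci)
            m = max(a, b)
            total += (b - a) / m if m > 0 else 0.0
    return total / len(points)

def best_k(points, k_min=2, k_max=None):
    """Score every level of the hierarchy and return the best k with all scores."""
    levels = agglomerative_levels(points)
    if k_max is None:
        k_max = len(points) - 1
    scores = {k: silhouette(points, levels[k]) for k in range(k_min, k_max + 1)}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): three well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
k, scores = best_k(points)
print(k)  # prints 3
```

The pairwise search makes this sketch far too slow for the data sizes the abstract has in mind; it is only meant to show how one hierarchy can be evaluated at every cut level without fixing the number of clusters in advance.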
Pages: 3-13
Page count: 11