Inertia-Based Indices to Determine the Number of Clusters in K-Means: An Experimental Evaluation

Cited by: 4
Authors
Rykov, Andrei [1]
de Amorim, Renato Cordeiro [2]
Makarenkov, Vladimir [3, 4]
Mirkin, Boris [1, 5]
Affiliations
[1] Natl Res Univ Higher Sch Econ, Dept Data Anal & Machine Intelligence, Moscow 101000, Russia
[2] Univ Essex, Comp Sci & Elect Engn Dept, Wivenhoe CO4 3SQ, England
[3] Imagia Cybernet, Montreal, PQ H3C 3P8, Canada
[4] Mila Quebec AI Inst, Montreal, PQ H2S 3H1, Canada
[5] Univ London, Dept Comp Sci & Informat Syst, London WC1E 7HX, England
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Indexes; Clustering algorithms; Euclidean distance; Amplitude modulation; Partitioning algorithms; Computer science; K-means; number of clusters; inertia; elbow method; Calinski-Harabasz index; Hartigan rule; ALGORITHM
DOI
10.1109/ACCESS.2024.3350791
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper gives an experimentally supported review and comparison of several indices based on the conventional K-means inertia criterion for determining the number of clusters, K, in datasets, using the popular Silhouette width index as a benchmark. Our experiments involve a novel version of the Elbow index, defined using values of K two or three steps apart. We also discuss alternative ways of computing the inertia and summarizing its values. Even though there are no overall winners in our experiments, some of our results are very conclusive and can be used as a guide for indices determining the number of clusters in K-means.
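As an illustration of the kind of inertia-based index the abstract refers to, here is a minimal sketch in pure Python. It is not the authors' code or experimental setup. The toy data, the helper names (`make_blobs`, `kmeans_inertia`, `calinski_harabasz`, `hartigan`) and the restart settings are assumptions made for this example, and the paper's novel Elbow variant is not reproduced because the record does not define it. The two formulas are the standard textbook ones: Calinski-Harabasz, CH(K) = ((T - W_K)/(K - 1)) / (W_K/(N - K)), where the K with the largest value is chosen; and Hartigan, HK(K) = (W_K/W_{K+1} - 1)(N - K - 1), where by Hartigan's rule of thumb the smallest K with HK(K) <= 10 is chosen. Here W_K is the K-means inertia, T = W_1 is the total inertia, and N is the number of points.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans_inertia(data, k, n_init=30, n_iter=100, seed=0):
    """Lowest inertia W_K found over n_init random starts of Lloyd's algorithm."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(data, k)  # initial centers drawn from the data
        for _ in range(n_iter):
            clusters = [[] for _ in range(k)]
            for p in data:
                j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
                clusters[j].append(p)
            # An empty cluster keeps its previous center.
            new_centers = [mean(c) if c else centers[i]
                           for i, c in enumerate(clusters)]
            if new_centers == centers:
                break
            centers = new_centers
        inertia = sum(min(sq_dist(p, c) for c in centers) for p in data)
        best = min(best, inertia)
    return best

def calinski_harabasz(w, n):
    """CH(K) for every K > 1 in w, where w maps K to W_K and w[1] is T."""
    total = w[1]
    return {k: ((total - wk) / (k - 1)) / (wk / (n - k))
            for k, wk in w.items() if k > 1}

def hartigan(w, n):
    """HK(K) for every K such that both W_K and W_{K+1} are available."""
    return {k: (w[k] / w[k + 1] - 1) * (n - k - 1) for k in w if k + 1 in w}

def make_blobs(seed=1, per_cluster=50):
    """Hypothetical toy data: three well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
            for cx, cy in centers for _ in range(per_cluster)]

data = make_blobs()
n = len(data)
w = {k: kmeans_inertia(data, k) for k in range(1, 7)}  # W_1 .. W_6
ch = calinski_harabasz(w, n)
hk = hartigan(w, n)
best_k = max(ch, key=ch.get)  # K with the largest Calinski-Harabasz value
print("W_K:", {k: round(v, 1) for k, v in w.items()})
print("CH :", {k: round(v, 1) for k, v in ch.items()})
print("HK :", {k: round(v, 1) for k, v in hk.items()})
print("K chosen by CH:", best_k)
```

Both indices depend on the data only through the inertia values W_K, which is why a single K-means sweep over K is enough to evaluate them. Hartigan's threshold of 10 is a rule of thumb, so the sketch prints the HK values rather than committing to a selected K.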
Pages: 11761-11773
Number of pages: 13
Related papers
50 in total
  • [41] Apache Mahout's k-Means vs. Fuzzy k-Means Performance Evaluation
    Xhafa, Fatos
    Bogza, Adriana
    Caballe, Santi
    Barolli, Leonard
    [J]. 2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKING AND COLLABORATIVE SYSTEMS (INCOS), 2016, : 110 - 116
  • [42] Classification of countries based on development indices by using K-means and grey relational analysis
    Basel, Sayel
    Gopakumar, K. U.
    Rao, R. Prabhakara
    [J]. GEOJOURNAL, 2022, 87 (05) : 3915 - 3933
  • [43] A GENERALIZED k-MEANS PROBLEM FOR CLUSTERING AND AN ADMM-BASED k-MEANS ALGORITHM
    Ling, Liyun
    Gu, Yan
    Zhang, Su
    Wen, Jie
    [J]. JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION, 2024, 20 (06) : 2089 - 2115
  • [44] Automatic detection of outliers and the number of clusters in k-means clustering via Chebyshev-type inequalities
    Olukanmi, Peter
    Nelwamondo, Fulufhelo
    Marwala, Tshilidzi
    Twala, Bhekisipho
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (08): : 5939 - 5958
  • [45] Classification of countries based on development indices by using K-means and grey relational analysis
    Sayel Basel
    K. U. Gopakumar
    R. Prabhakara Rao
    [J]. GeoJournal, 2022, 87 : 3915 - 3933
  • [46] Kernel Penalized K-means: A feature selection method based on Kernel K-means
    Maldonado, Sebastian
    Carrizosa, Emilio
    Weber, Richard
    [J]. INFORMATION SCIENCES, 2015, 322 : 150 - 160
  • [47] Automatic detection of outliers and the number of clusters in k-means clustering via Chebyshev-type inequalities
    Peter Olukanmi
    Fulufhelo Nelwamondo
    Tshilidzi Marwala
    Bhekisipho Twala
    [J]. Neural Computing and Applications, 2022, 34 : 5939 - 5958
  • [48] K-means Split Revisited: Well-grounded Approach and Experimental Evaluation
    Grigorev, Valentin
    Chernishev, George
    [J]. SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 2251 - 2252
  • [49] Determination of the number of clusters used in fuzzy inference systems by means of K-means and modeling of dam volume: Kestel dam example
    Kucukerdem, Tulay Sugra
    Kilit, Murat
    Saplioglu, Kemal
    [J]. PAMUKKALE UNIVERSITY JOURNAL OF ENGINEERING SCIENCES-PAMUKKALE UNIVERSITESI MUHENDISLIK BILIMLERI DERGISI, 2019, 25 (08): : 962 - 967
  • [50] Technical Note: Using k-means clustering to determine the number and position of isocenters in MLC-based multiple target intracranial radiosurgery
    Yock, Adam D.
    Kim, Gwe-Ya
    [J]. JOURNAL OF APPLIED CLINICAL MEDICAL PHYSICS, 2017, 18 (05): : 351 - 357