Silhouette Analysis for Performance Evaluation in Machine Learning with Applications to Clustering

Cited by: 108
Authors
Shutaywi, Meshal [1]
Kachouie, Nezamoddin N. [2]
Affiliations
[1] King Abdulaziz Univ, Dept Math, Rabigh 21911, Saudi Arabia
[2] Florida Inst Technol, Dept Math Sci, Melbourne, FL 32901 USA
Keywords
k-means; kernel k-means; machine learning; nonlinear clustering; silhouette index; weighted clustering
DOI
10.3390/e23060759
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
Grouping objects based on their similarities is a common and important task in machine learning applications. Many clustering methods have been developed. Among them, k-means based methods have been widely used, and several extensions, such as k-means++ and kernel k-means, have been developed to improve the original k-means method. K-means is a linear clustering method; that is, it divides the objects into linearly separable groups, while kernel k-means is a nonlinear technique. Kernel k-means projects the elements to a higher-dimensional feature space using a kernel function and then groups them. Different kernel functions may not perform equally well on a given data set, so choosing the right kernel for an application can be challenging. In our previous work, we introduced a weighted majority voting method for clustering based on normalized mutual information (NMI). NMI is a supervised measure: the true labels of a training set are required to calculate it. In this study, we extend our previous work on aggregating clustering results to develop an unsupervised weighting function for cases where a training set is not available. The proposed weighting function is based on the Silhouette index, an unsupervised criterion, so a training set is not required to calculate it. This makes the new method more consistent with the unsupervised nature of clustering.
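The idea of weighting candidate clusterings by an unsupervised score can be illustrated with a minimal sketch. It computes the standard mean silhouette width in pure Python and turns the scores of several candidate labelings into normalized weights. The function names, the clipping of negative scores to zero, and the sum-to-one normalization are assumptions made for illustration; the abstract does not give the paper's actual weighting function.

```python
import math


def silhouette_index(points, labels):
    """Mean silhouette width, s(i) = (b - a) / max(a, b), over all points."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # common convention: s(i) = 0 for a singleton cluster
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


def silhouette_weights(points, candidate_labelings):
    """Hypothetical weighting: clip negative scores to 0, then normalize to sum to 1."""
    scores = [max(silhouette_index(points, labs), 0.0) for labs in candidate_labelings]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)  # fall back to equal weights
    return [s / total for s in scores]


# Two tight groups far apart; one labeling matches them, the other splits them.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]
bad = [0, 1, 0, 1]
weights = silhouette_weights(points, [good, bad])
```

No true labels appear anywhere in the computation, which is the property the abstract contrasts with NMI. In this toy case the labeling that matches the two groups receives all of the weight, because the other labeling has a negative silhouette index.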
Pages: 17
Related papers
50 in total
  • [31] Generalizing Correspondence Analysis for Applications in Machine Learning
    Hsu, Hsiang
    Salamatian, Salman
    Calmon, Flavio P.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (12) : 9347 - 9362
  • [32] Applications of Machine Learning in Analysis of Citation Network
    Pradhan, Dinesh K.
    Chakraborty, Joyita
    Nandi, Subrata
    PROCEEDINGS OF THE 6TH ACM IKDD CODS AND 24TH COMAD, 2019, : 330 - 333
  • [33] Machine Learning Analysis of IP ID Applications
    Shulman, Haya
    Zhao, Shujie
    51ST ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS - SUPPLEMENTAL VOL (DSN 2021), 2021, : 15 - 16
  • [34] Machine Learning Applications in Medical Image Analysis
    El-Baz, Ayman
    Gimel'farb, Georgy
    Suzuki, Kenji
    COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2017, 2017
  • [35] Machine learning applications in cell image analysis
    Kan, Andrey
    IMMUNOLOGY AND CELL BIOLOGY, 2017, 95 (06): : 525 - 530
  • [36] Performance Of Soil Prediction Using Machine Learning For Data Clustering Methods
    Rajeshwari, M.
    Shunmuganathan, N.
    Sankarasubramanian, R.
    JOURNAL OF ALGEBRAIC STATISTICS, 2022, 13 (02) : 825 - 831
  • [37] Autonomous clustering for machine learning
    Luaces, O
    del Coz, JJ
    Quevedo, JR
    Alonso, J
    Ranilla, J
    Bahamonde, A
    FOUNDATIONS AND TOOLS FOR NEURAL MODELING, PROCEEDINGS, VOL I, 1999, 1606 : 497 - 506
  • [38] Machine Learning Approach for Sequence Clustering with Applications to Ground-Motion Selection
    Zhang, Ruiyang
    Hajjar, Jerome
    Sun, Hao
    JOURNAL OF ENGINEERING MECHANICS, 2020, 146 (06)
  • [39] Multi-agent reinforcement learning clustering algorithm based on silhouette coefficient
    Du, Peng
    Li, Fenglian
    Shao, Jianli
    NEUROCOMPUTING, 2024, 596
  • [40] Performance comparison of bio-inspired and learning-based clustering analysis with machine learning techniques for classification of EEG signals
    Prabhakar, Sunil Kumar
    Won, Dong-Ok
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2023, 6