On efficient model selection for sparse hard and fuzzy center-based clustering algorithms

Cited by: 6
Authors
Gupta, Avisek [1]
Das, Swagatam [1]
Affiliations
[1] Indian Stat Inst, Elect & Commun Sci Unit, 203 BT Rd, Kolkata 700108, W Bengal, India
Keywords
Sparse clustering; Model selection; Sparse k-means; Sparse fuzzy c-means; Bayesian information criterion; VALIDITY INDEX; C-MEANS; NUMBER
DOI
10.1016/j.ins.2021.12.070
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The class of center-based clustering algorithms offers methods to efficiently identify clusters in data sets, making them applicable to larger data sets. While a data set may contain several features, not all of them may be equally informative or helpful towards cluster detection. Therefore, sparse center-based clustering methods offer a way to select only those features that may be useful in identifying the clusters present in a data set. However, to automatically determine the degree to which features should be selected, these methods use the Permutation Method, which involves generating and clustering multiple randomly permuted data sets, leading to much higher computation costs. In this paper, we propose an improved approach towards model selection for sparse clustering by using expressions of the Bayesian Information Criterion (BIC) derived for the center-based clustering methods of k-Means and Fuzzy c-Means. The derived expressions of BIC require significantly lower computation costs, yet allow us to compare and select a suitable sparse clustering among several possible sparse partitions that may have selected different subsets of features. Experiments on synthetic and real-world data sets show that using BIC for model selection leads to remarkable improvements in the identification of sparse clusterings for both Sparse k-Means and Sparse Fuzzy c-Means. © 2022 Elsevier Inc. All rights reserved.
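The idea in the abstract, scoring several candidate feature subsets with a BIC-style criterion instead of clustering many permuted data sets, can be illustrated with a minimal sketch. The BIC expression below is NOT the one derived in the paper (which this record does not reproduce). It is an assumed stand-in: a hard-assignment Gaussian likelihood with one variance per feature, where selected features get cluster-specific means and unselected features get a single global mean. The function names `kmeans` and `bic_for_subset`, the toy data, and the candidate subsets are all hypothetical.

```python
import numpy as np


def kmeans(X, k, n_init=5, n_iter=50, seed=0):
    """Lloyd's algorithm with a few random restarts; returns the best labels."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _restart in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
        for _step in range(n_iter):
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)
        # Final assignment and within-cluster cost for this restart.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        cost = d2.min(axis=1).sum()
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels


def bic_for_subset(X, selected, k, seed=0):
    """Assumed BIC-style score (lower is better) for clustering on `selected`."""
    n, p = X.shape
    cols = list(selected)
    labels = kmeans(X[:, cols], k, seed=seed)
    # Fitted mean per point: global mean everywhere, then cluster means
    # overwrite the selected columns only.
    fitted = np.tile(X.mean(axis=0), (n, 1))
    for j in range(k):
        rows = np.flatnonzero(labels == j)
        if rows.size:
            fitted[np.ix_(rows, cols)] = X[np.ix_(rows, cols)].mean(axis=0)
    var = ((X - fitted) ** 2).mean(axis=0)  # per-feature ML variance
    log_lik = -0.5 * n * np.sum(np.log(2 * np.pi * var) + 1.0)
    # k means per selected feature, 1 mean per unselected feature, p variances.
    n_params = k * len(cols) + (p - len(cols)) + p
    return -2.0 * log_lik + n_params * np.log(n)


# Toy data (assumption): features 0 and 1 separate two groups, 2 and 3 are noise.
rng = np.random.default_rng(42)
n = 300
true_labels = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 4))
X[:, :2] += 6.0 * true_labels[:, None]

candidates = [(0,), (0, 1), (2, 3), (0, 1, 2, 3)]
scores = {c: bic_for_subset(X, c, k=2) for c in candidates}
best = min(scores, key=scores.get)
print(best, scores)
```

Each candidate costs one clustering run plus a closed-form score, which is the source of the savings over a permutation-based procedure. Modelling the unselected features with a global mean is a design choice made here so that every candidate is scored on the same full data set and the scores stay comparable across subsets.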
Pages: 29-44
Number of pages: 16