Sparse probabilistic K-means

Cited by: 7
Authors
Jung, Yoon Mo [1]
Whang, Joyce Jiyoung [2]
Yun, Sangwoon [3]
Affiliations
[1] Sungkyunkwan Univ, Dept Math, Suwon 16419, South Korea
[2] Sungkyunkwan Univ, Dept Comp Sci & Engn, Suwon 16419, South Korea
[3] Sungkyunkwan Univ, Dept Math Educ, Seoul 03063, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Clustering; K-means; Alternating minimization; SELECTION;
DOI
10.1016/j.amc.2020.125328
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
The goal of clustering is to partition a set of data points into groups of similar data points, called clusters. Clustering algorithms can be classified into two categories: hard and soft clustering. Hard clustering assigns each data point to one cluster exclusively. On the other hand, soft clustering allows probabilistic assignments to clusters. In this paper, we propose a new model which combines the benefits of these two models: clarity of hard clustering and probabilistic assignments of soft clustering. Since the majority of data usually have a clear association, only a few points may require a probabilistic interpretation. Thus, we apply the ℓ1 norm constraint to impose sparsity on probabilistic assignments. Moreover, we also incorporate outlier detection in our clustering model to simultaneously detect outliers which can cause serious problems in statistical analyses. To optimize the model, we introduce an alternating minimization method and prove its convergence. Numerical experiments and comparisons with existing models show the soundness and effectiveness of the proposed model. © 2020 Elsevier Inc. All rights reserved.
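The idea of sparse probabilistic assignments optimized by alternating minimization can be illustrated with a minimal sketch. This is not the paper's model: the ℓ1-constrained formulation and the outlier detection are not given in the abstract and are not reproduced here. As a stand-in, the sketch uses entropy-regularized soft K-means in which each point keeps nonzero membership in at most `s` nearest clusters. The function name `sparse_soft_kmeans` and the parameters `s`, `beta`, `n_iter`, and `seed` are hypothetical and chosen only for illustration.

```python
import numpy as np

def sparse_soft_kmeans(X, k, s=2, beta=1.0, n_iter=50, seed=0):
    """Illustrative stand-in, NOT the paper's l1-constrained model.

    Alternates between:
      (1) assignment step: softmax memberships restricted to the s nearest
          clusters of each point (all other memberships are exactly zero);
      (2) centroid step: membership-weighted means.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distances, shape (n_points, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Unnormalized soft weights; subtracting the row minimum keeps
        # the nearest cluster's weight at exactly 1 (numerical stability).
        w = np.exp(-beta * (d2 - d2.min(axis=1, keepdims=True)))
        if s < k:
            # Zero out every cluster that is not among the s nearest.
            far = np.argpartition(d2, s - 1, axis=1)[:, s:]
            np.put_along_axis(w, far, 0.0, axis=1)
        U = w / w.sum(axis=1, keepdims=True)  # each row sums to 1
        # Weighted-mean update; clusters with zero mass keep their center.
        mass = U.sum(axis=0)
        new_centers = centers.copy()
        nz = mass > 0
        new_centers[nz] = (U.T @ X)[nz] / mass[nz][:, None]
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return U, centers
```

Under these assumptions, `s=1` reduces the assignment step to hard K-means (one membership per point), while `s=k` gives fully soft assignments; intermediate values let only ambiguous points share membership across a few clusters.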
Pages: 12
Related papers
50 in total
  • [31] Sparse Weighted K-Means for Groups of Mixed-Type Variables
    Chavent, Marie
    Cottrell, Marie
    Lacaille, Jerome
    Mourer, Alex
    Olteanu, Madalina
    ADVANCES IN SELF-ORGANIZING MAPS, LEARNING VECTOR QUANTIZATION, CLUSTERING AND DATA VISUALIZATION: DEDICATED TO THE MEMORY OF TEUVO KOHONEN, WSOM+ 2022, 2022, 533 : 1 - 10
  • [32] Robust and sparse k-means clustering for high-dimensional data
    Brodinova, Sarka
    Filzmoser, Peter
    Ortner, Thomas
    Breiteneder, Christian
    Rohm, Maia
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (04) : 905 - 932
  • [33] K-means Recovers ICA Filters when Independent Components are Sparse
    Vinnikov, Alon
    Shalev-Shwartz, Shai
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 712 - 720
  • [34] Simple and Scalable Sparse k-means Clustering via Feature Ranking
    Zhang, Zhiyue
    Lange, Kenneth
    Xu, Jason
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [35] Sparse Sequential Generalization of K-means for dictionary training on noisy signals
    Sahoo, Sujit Kumar
    Makur, Anamitra
    SIGNAL PROCESSING, 2016, 129 : 62 - 66
  • [36] The Sparse MinMax k-Means Algorithm for High-Dimensional Clustering
    Dey, Sayak
    Das, Swagatam
    Mallipeddi, Rammohan
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 2103 - 2110
  • [37] Subarray partition based on sparse array weighted K-means clustering
    Zhao, Jiayu
    Huang, Jianming
    Cui, Yansong
    Zhang, Naibo
    Wang, Yuxuan
    Wang, Zilai
    ELECTRONICS LETTERS, 2024, 60 (18)
  • [38] Data Imputation with an Improved Robust and Sparse Fuzzy K-Means Algorithm
    Scully-Allison, Connor
    Wu, Rui
    Dascalu, Sergiu M.
    Barford, Lee
    Harris, Frederick C., Jr.
    16TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY-NEW GENERATIONS (ITNG 2019), 2019, 800 : 299 - 306
  • [39] Sparse kernel PCA by Kernel K-means and preimage reconstruction algorithms
    Marukatat, Sanparith
    PRICAI 2006: TRENDS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4099 : 454 - 463
  • [40] Empirical Evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means Clustering Algorithms
    Banerjee, Shreya
    Choudhary, Ankit
    Pal, Somnath
    2015 IEEE INTERNATIONAL WIE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (WIECON-ECE), 2015, : 172 - 176