Sparse probabilistic K-means

Cited by: 7
Authors
Jung, Yoon Mo [1]
Whang, Joyce Jiyoung [2]
Yun, Sangwoon [3]
Affiliations
[1] Sungkyunkwan Univ, Dept Math, Suwon 16419, South Korea
[2] Sungkyunkwan Univ, Dept Comp Sci & Engn, Suwon 16419, South Korea
[3] Sungkyunkwan Univ, Dept Math Educ, Seoul 03063, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Clustering; K-means; Alternating minimization; SELECTION;
DOI
10.1016/j.amc.2020.125328
Chinese Library Classification (CLC) number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
The goal of clustering is to partition a set of data points into groups of similar data points, called clusters. Clustering algorithms can be classified into two categories: hard and soft clustering. Hard clustering assigns each data point to exactly one cluster, whereas soft clustering allows probabilistic assignments to clusters. In this paper, we propose a new model that combines the benefits of both: the clarity of hard clustering and the probabilistic assignments of soft clustering. Since the majority of data usually have a clear association, only a few points may require a probabilistic interpretation. Thus, we apply the ℓ1-norm constraint to impose sparsity on probabilistic assignments. Moreover, we incorporate outlier detection into our clustering model to simultaneously detect outliers, which can cause serious problems in statistical analyses. To optimize the model, we introduce an alternating minimization method and prove its convergence. Numerical experiments and comparisons with existing models show the soundness and effectiveness of the proposed model. © 2020 Elsevier Inc. All rights reserved.
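The abstract does not state the model's objective, so the following is only an illustrative sketch of the general idea (sparse probabilistic assignments found by alternating minimization), not the authors' formulation. It swaps in a different, well-known device: a quadratic penalty on the assignments, weighted by a hypothetical parameter gamma, whose assignment step is a Euclidean projection onto the probability simplex and therefore produces exact zeros. Outlier detection is omitted, and all function names and the toy data are assumptions made for illustration.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    theta, css = 0.0, 0.0
    for j, uj in enumerate(sorted(v, reverse=True), start=1):
        css += uj
        t = (css - 1.0) / j
        if uj - t > 0:          # holds for a prefix of the sorted values
            theta = t
    return [max(x - theta, 0.0) for x in v]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(X, C, U, gamma):
    # sum_i sum_k u_ik * ||x_i - c_k||^2 + (gamma / 2) * u_ik^2
    return sum(
        U[i][k] * sq_dist(X[i], C[k]) + 0.5 * gamma * U[i][k] ** 2
        for i in range(len(X))
        for k in range(len(C))
    )


def sparse_soft_kmeans(X, C0, gamma=1.0, iters=20):
    """Alternate exact minimization over assignments U and centers C.

    Stand-in model (an assumption, not the paper's): quadratic penalty
    on U with each row of U constrained to the probability simplex.
    """
    C = [list(c) for c in C0]
    history = []
    for _ in range(iters):
        # Assignment step: for fixed C, the minimizer for each point is
        # the projection of -distances / gamma onto the simplex.
        U = [project_simplex([-sq_dist(x, c) / gamma for c in C]) for x in X]
        # Center step: for fixed U, each center is a weighted mean.
        for k in range(len(C)):
            w = sum(u[k] for u in U)
            if w > 0:
                C[k] = [
                    sum(u[k] * x[d] for u, x in zip(U, X)) / w
                    for d in range(len(C[k]))
                ]
        history.append(objective(X, C, U, gamma))
    return C, U, history


# Toy data: two tight groups plus one point lying between them.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [5, 5.5]]
centers, U, history = sparse_soft_kmeans(X, [[0, 0], [10, 10]], gamma=60.0)
for x, u in zip(X, U):
    print(x, [round(p, 3) for p in u])
```

With these toy settings, the six points close to a group should receive a hard assignment (one entry exactly zero), and only the in-between point should keep a fractional assignment. Because each step minimizes the objective exactly over its own block of variables, the recorded objective values should not increase from one iteration to the next.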
Pages: 12