Sparse kernel k-means for high-dimensional data

Cited by: 2
Authors
Guan, Xin [1]
Terada, Yoshikazu [1, 2]
Affiliations
[1] Osaka Univ, Grad Sch Engn Sci, 1-3 Machikaneyama-cho, Toyonaka, Osaka 560-0043, Japan
[2] RIKEN Ctr Adv Intelligence Project AIP, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
Keywords
Clustering; Feature selection; Kernel method; Feature-selection method
DOI
10.1016/j.patcog.2023.109873
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The kernel k-means method usually loses its power when clustering high-dimensional data because of the large number of irrelevant features. We propose a novel sparse kernel k-means clustering method (SKKM) that extends the advantages of kernel k-means to the high-dimensional setting. We assign each feature a 0-1 indicator and optimize an equivalent kernel k-means loss function while penalizing the sum of the indicators. An alternating minimization algorithm is proposed to estimate both the class labels and the feature indicators. We prove the consistency of the proposed method for both clustering and feature selection. In addition, we apply the proposed framework to the normalized cut. In numerical experiments, we demonstrate that the proposed method performs better than or comparably to existing high-dimensional clustering methods.
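The alternating scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the Gaussian kernel, the between-cluster form of the objective, the greedy one-flip-at-a-time indicator update, and every function and parameter name (`gaussian_kernel`, `between_score`, `lam`, `gamma`, and so on) are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(X, w, gamma=0.5):
    """Gaussian kernel on the features selected by the 0-1 vector w (assumed kernel)."""
    Z = X[:, w.astype(bool)]
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    return np.exp(-gamma * d2)

def between_score(K, labels, k):
    """sum_c (1/n_c) sum_{i,j in c} K_ij - (1/n) sum_ij K_ij (assumed objective)."""
    score = -K.sum() / K.shape[0]
    for c in range(k):
        idx = labels == c
        if idx.any():
            score += K[np.ix_(idx, idx)].sum() / idx.sum()
    return score

def update_labels(K, labels, k, n_iter=20):
    """Kernel k-means reassignment with the kernel held fixed."""
    for _ in range(n_iter):
        dist = np.full((K.shape[0], k), np.inf)  # empty clusters stay at inf
        for c in range(k):
            idx = labels == c
            if idx.any():
                n_c = idx.sum()
                # Feature-space distance to centroid c; K_ii is dropped
                # because it is the same for every candidate cluster.
                dist[:, c] = (K[np.ix_(idx, idx)].sum() / n_c ** 2
                              - 2.0 * K[:, idx].sum(axis=1) / n_c)
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

def update_indicators(X, w, labels, k, lam, gamma):
    """Greedy pass: flip an indicator only if the penalized loss decreases."""
    def loss(w_):
        K = gaussian_kernel(X, w_, gamma)
        return -between_score(K, labels, k) + lam * w_.sum()
    best = loss(w)
    for j in range(X.shape[1]):
        trial = w.copy()
        trial[j] = 1 - trial[j]
        val = loss(trial)
        if val < best:
            w, best = trial, val
    return w

def sparse_kernel_kmeans_sketch(X, k, lam=0.1, gamma=0.5, n_outer=10, seed=0):
    """Alternate label updates and indicator updates until w stops changing."""
    rng = np.random.default_rng(seed)
    w = np.ones(X.shape[1], dtype=int)            # start with all features on
    labels = rng.integers(0, k, size=X.shape[0])  # random initial labels
    for _ in range(n_outer):
        labels = update_labels(gaussian_kernel(X, w, gamma), labels, k)
        new_w = update_indicators(X, w, labels, k, lam, gamma)
        if np.array_equal(new_w, w):
            break
        w = new_w
    return labels, w
```

Calling `labels, w = sparse_kernel_kmeans_sketch(X, k=2)` on an `(n, p)` array returns one label per row and one 0-1 indicator per column. The between-cluster form is used here because, with a Gaussian kernel, a within-cluster loss would be trivially minimized by switching every feature off; whether the paper's "equivalent" loss takes this form is not stated in the abstract.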
Pages: 11
Related papers
50 in total
  • [1] Robust and sparse k-means clustering for high-dimensional data
    Brodinova, Sarka
    Filzmoser, Peter
    Ortner, Thomas
    Breiteneder, Christian
    Rohm, Maia
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (04) : 905 - 932
  • [2] Robust and sparse k-means clustering for high-dimensional data
    Šárka Brodinová
    Peter Filzmoser
    Thomas Ortner
    Christian Breiteneder
    Maia Rohm
    [J]. Advances in Data Analysis and Classification, 2019, 13 : 905 - 932
  • [3] The Sparse MinMax k-Means Algorithm for High-Dimensional Clustering
    Dey, Sayak
    Das, Swagatam
    Mallipeddi, Rammohan
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 2103 - 2110
  • [4] An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data
    Jing, Liping
    Ng, Michael K.
    Huang, Joshua Zhexue
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2007, 19 (08) : 1026 - 1041
  • [5] Solving k-means on High-Dimensional Big Data
    Kappmeier, Jan-Philipp W.
    Schmidt, Daniel R.
    Schmidt, Melanie
    [J]. EXPERIMENTAL ALGORITHMS, SEA 2015, 2015, 9125 : 259 - 270
  • [6] SPARSE k-MEANS WITH l∞/l0 PENALTY FOR HIGH-DIMENSIONAL DATA CLUSTERING
    Chang, Xiangyu
    Wang, Yu
    Li, Rongjian
    Xu, Zongben
    [J]. STATISTICA SINICA, 2018, 28 (03) : 1265 - 1284
  • [7] Efficient High-Dimensional Kernel k-Means++ with Random Projection
    Chan, Jan Y. K.
    Leung, Alex Po
    Xie, Yunbo
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (15):
  • [8] Sparse K-Means with the lq(0 ≤ q < 1) Constraint for High-Dimensional Data Clustering
    Wang, Yu
    Chang, Xiangyu
    Li, Rongjian
    Xu, Zongben
    [J]. 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 797 - 806
  • [9] Sparse kernel k-means clustering
    Park, Beomjin
    Park, Changyi
    Hong, Sungchul
    Choi, Hosik
    [J]. JOURNAL OF APPLIED STATISTICS, 2024,
  • [10] Fast Adaptive K-Means Subspace Clustering for High-Dimensional Data
    Wang, Xiao-Dong
    Chen, Rung-Ching
    Yan, Fei
    Zeng, Zhi-Qiang
    Hong, Chao-Qun
    [J]. IEEE ACCESS, 2019, 7 : 42639 - 42651