Sparse kernel k-means for high-dimensional data

Cited by: 2

Authors:

Guan, Xin [1]

Terada, Yoshikazu [1,2]

Affiliations:

[1] Osaka Univ, Grad Sch Engn Sci, 1 3 Machikaneyamacho, Toyonaka, Osaka 5600043, Japan

[2] RIKEN Ctr Adv Intelligence Project AIP, 1 4 1 Nihonbashi,Chuo ku, Tokyo 1030027, Japan

Source:

PATTERN RECOGNITION | 2023 / Vol. 144

Keywords:

Clustering; Feature selection; Kernel method; FEATURE-SELECTION METHOD;

DOI:

10.1016/j.patcog.2023.109873

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The kernel k-means method usually loses its power when clustering high-dimensional data because of the large number of irrelevant features. We propose a novel sparse kernel k-means clustering (SKKM) method to extend the advantages of kernel k-means to high-dimensional settings. We assign each feature a 0-1 indicator and optimize an equivalent kernel k-means loss function while penalizing the sum of the indicators. An alternating minimization algorithm is proposed to estimate both the class labels and the feature indicators. We prove the consistency of both the clustering and the feature selection of the proposed method. In addition, we apply the proposed framework to the normalized cut. In numerical experiments, we demonstrate that the proposed method performs better than or comparably to existing high-dimensional clustering methods.
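
The alternating scheme outlined in the abstract can be illustrated with a small Python sketch. This is not the authors' SKKM algorithm: the abstract does not state the exact loss, kernel, or update rules, so everything below is an assumption made for illustration. The sketch assumes a Gaussian kernel on the selected features, a between-cluster form of the kernel k-means objective (with a Gaussian kernel, the plain within-cluster loss is trivially zero when every feature is dropped), greedy single-indicator flips for the feature step, and a farthest-point initialization. All function names and parameters (`lam`, `gamma`, and so on) are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, mask, gamma):
    # Gaussian kernel using only the features whose 0-1 indicator is 1.
    Z = X[:, mask.astype(bool)]
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def between_score(K, labels, k):
    # Assumed objective: sum_c (1/n_c) sum_{i,j in c} K_ij - (1/n) sum_{i,j} K_ij.
    score = -K.sum() / K.shape[0]
    for c in range(k):
        idx = labels == c
        if idx.any():
            score += K[np.ix_(idx, idx)].sum() / idx.sum()
    return score

def init_labels(K, k):
    # Farthest-point initialization in kernel similarity (an assumption).
    centers = [0]
    for _ in range(1, k):
        centers.append(int(np.argmin(K[:, centers].max(axis=1))))
    return K[:, centers].argmax(axis=1)

def kernel_kmeans(K, labels, k, n_iter=50):
    # Lloyd-style updates using feature-space distances computed from K.
    n = K.shape[0]
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            idx = labels == c
            m = idx.sum()
            if m == 0:
                continue
            dist[:, c] = (np.diag(K) - 2.0 * K[:, idx].sum(axis=1) / m
                          + K[np.ix_(idx, idx)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def objective(X, labels, mask, k, lam, gamma):
    # Penalized score: between-cluster term minus lam times the indicator sum.
    K = gaussian_kernel(X, mask, gamma)
    return between_score(K, labels, k) - lam * mask.sum()

def update_features(X, labels, mask, k, lam, gamma):
    # Greedy pass: flip one indicator at a time, keep flips that raise the score.
    mask = mask.copy()
    best = objective(X, labels, mask, k, lam, gamma)
    for j in range(X.shape[1]):
        trial = mask.copy()
        trial[j] = 1 - trial[j]
        val = objective(X, labels, trial, k, lam, gamma)
        if val > best:
            mask, best = trial, val
    return mask

def sparse_kernel_kmeans(X, k, lam, gamma, max_iter=20):
    # Alternate between the label step and the feature-indicator step.
    mask = np.ones(X.shape[1], dtype=int)
    K = gaussian_kernel(X, mask, gamma)
    labels = init_labels(K, k)
    for _ in range(max_iter):
        labels = kernel_kmeans(K, labels, k)
        new_mask = update_features(X, labels, mask, k, lam, gamma)
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
        K = gaussian_kernel(X, mask, gamma)
    return labels, mask

# Toy data: 2 informative features (two well-separated groups) plus 3 noise features.
rng = np.random.default_rng(0)
m = 10
signal = np.vstack([rng.uniform(-0.2, 0.2, (m, 2)),
                    6.0 + rng.uniform(-0.2, 0.2, (m, 2))])
noise = rng.uniform(-1.0, 1.0, (2 * m, 3))
X = np.hstack([signal, noise])
labels, mask = sparse_kernel_kmeans(X, k=2, lam=0.1, gamma=0.05)
print(mask)
print(labels)
```

On this toy data the two groups are far apart in the first two columns, so the sketch keeps those two indicators and drops the three noise columns. The penalty weight `lam` and kernel width `gamma` were picked by hand for the example and would need tuning on real data.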

Pages: 11

50 records in total

[1] Robust and sparse k-means clustering for high-dimensional data
Brodinova, Sarka
Filzmoser, Peter
Ortner, Thomas
Breiteneder, Christian
Rohm, Maia
[J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (04) : 905 - 932
[2] Robust and sparse k-means clustering for high-dimensional data
Šárka Brodinová
Peter Filzmoser
Thomas Ortner
Christian Breiteneder
Maia Rohm
[J]. Advances in Data Analysis and Classification, 2019, 13 : 905 - 932
[3] The Sparse MinMax k-Means Algorithm for High-Dimensional Clustering
Dey, Sayak
Das, Swagatam
Mallipeddi, Rammohan
[J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 2103 - 2110
[4] An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data
Jing, Liping
Ng, Michael K.
Huang, Joshua Zhexue
[J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2007, 19 (08) : 1026 - 1041
[5] Solving k-means on High-Dimensional Big Data
Kappmeier, Jan-Philipp W.
Schmidt, Daniel R.
Schmidt, Melanie
[J]. EXPERIMENTAL ALGORITHMS, SEA 2015, 2015, 9125 : 259 - 270
[6] SPARSE k-MEANS WITH l∞/l0 PENALTY FOR HIGH-DIMENSIONAL DATA CLUSTERING
Chang, Xiangyu
Wang, Yu
Li, Rongjian
Xu, Zongben
[J]. STATISTICA SINICA, 2018, 28 (03) : 1265 - 1284
[7] Efficient High-Dimensional Kernel k-Means++ with Random Projection
Chan, Jan Y. K.
Leung, Alex Po
Xie, Yunbo
[J]. APPLIED SCIENCES-BASEL, 2021, 11 (15):
[8] Sparse K-Means with the lq(0 ≤ q < 1) Constraint for High-Dimensional Data Clustering
Wang, Yu
Chang, Xiangyu
Li, Rongjian
Xu, Zongben
[J]. 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 797 - 806
[9] Sparse kernel k-means clustering
Park, Beomjin
Park, Changyi
Hong, Sungchul
Choi, Hosik
[J]. JOURNAL OF APPLIED STATISTICS, 2024,
[10] Fast Adaptive K-Means Subspace Clustering for High-Dimensional Data
Wang, Xiao-Dong
Chen, Rung-Ching
Yan, Fei
Zeng, Zhi-Qiang
Hong, Chao-Qun
[J]. IEEE ACCESS, 2019, 7 : 42639 - 42651
