Subspace K-means clustering

Authors
Marieke E. Timmerman
Eva Ceulemans
Kim De Roover
Karla Van Leeuwen
Affiliations
[1] University of Groningen, Heymans Institute for Psychology, Psychometrics & Statistics
[2] K.U. Leuven, Educational Sciences
[3] K.U. Leuven, Parenting and Special Education
Source
Behavior Research Methods | 2013 / Vol. 45
Keywords
Cluster analysis; Cluster recovery; Multivariate data; Reduced K-means; K-means; Factorial K-means; Mixtures of factor analyzers; MCLUST
Abstract
To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the existing related clustering methods, including deterministic, stochastic, and unsupervised learning approaches. To evaluate subspace K-means, we performed a comparative simulation study, in which we manipulated the overlap of subspaces, the between-cluster variance, and the error variance. The study shows that the subspace K-means algorithm is sensitive to local minima but that the problem can be reasonably dealt with by using partitions of various cluster procedures as a starting point for the algorithm. Subspace K-means performs very well in recovering the true clustering across all conditions considered and appears to be superior to its competitor methods: K-means, reduced K-means, factorial K-means, mixtures of factor analyzers (MFA), and MCLUST. The best competitor method, MFA, showed a performance similar to that of subspace K-means in easy conditions but deteriorated in more difficult ones. Using data from a study on parental behavior, we show that subspace K-means analysis provides a rich insight into the cluster characteristics, in terms of both the relative positions of the clusters (via the centroids) and the shape of the clusters (via the within-cluster residuals).
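The remedy for local minima mentioned in the abstract (starting the algorithm from partitions produced by several clustering procedures and keeping the best result) can be illustrated with a minimal sketch. This is not the authors' implementation: plain K-means (Lloyd) iterations stand in for subspace K-means, the three starting partitions are toy stand-ins for the output of other clustering procedures, and all function names (refine, best_of_starts, and so on) are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroids_of(points, labels, k):
    """Mean of each cluster; an empty cluster gets None."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for j, v in enumerate(p):
            sums[c][j] += v
    return [[s / counts[c] for s in sums[c]] if counts[c] else None
            for c in range(k)]

def loss(points, labels, k):
    """Within-cluster sum of squares of a partition."""
    cents = centroids_of(points, labels, k)
    return sum(sq_dist(p, cents[c]) for p, c in zip(points, labels))

def refine(points, labels, k, max_iter=100):
    """Lloyd iterations from a given starting partition.
    Assumption: plain K-means stands in for subspace K-means here."""
    labels = list(labels)
    for _ in range(max_iter):
        cents = centroids_of(points, labels, k)
        live = [c for c in range(k) if cents[c] is not None]
        new = [min(live, key=lambda c: sq_dist(p, cents[c])) for p in points]
        if new == labels:  # partition is stable: a (local) minimum
            break
        labels = new
    return labels

def best_of_starts(points, starts, k):
    """Refine every starting partition and keep the lowest-loss result."""
    results = [refine(points, s, k) for s in starts]
    return min(results, key=lambda lab: loss(points, lab, k))

# Toy data: two well-separated groups in two dimensions.
rng = random.Random(0)
points = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
          + [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(30)])
k = 2

# Hypothetical starting partitions standing in for other procedures.
starts = [
    [rng.randrange(k) for _ in points],   # random partition
    [i % k for i in range(len(points))],  # alternating labels
    [int(p[0] > 4) for p in points],      # crude split on one variable
]

best = best_of_starts(points, starts, k)
print("best loss:", loss(points, best, k))
```

Each refinement can only lower the loss relative to its own starting partition, so the strategy cannot do worse than the best single start; it does not guarantee the global minimum.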
Pages: 1011 - 1023
Page count: 12
Related papers
  • [1] Subspace K-means clustering
    Timmerman, Marieke E.
    Ceulemans, Eva
    De Roover, Kim
    Van Leeuwen, Karla
    [J]. BEHAVIOR RESEARCH METHODS, 2013, 45 (04) : 1011 - 1023
  • [2] A Heuristically Weighting K-Means Algorithm for Subspace Clustering
    Li, Boyang
    Jiang, Qingshan
    Chen, Lifei
    [J]. 2008 2ND INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY AND IDENTIFICATION, 2008, : 268 - +
  • [3] Subspace clustering by directly solving Discriminative K-means
    Gao, Chenhui
    Chen, Wenzhi
    Nie, Feiping
    Yu, Weizhong
    Yan, Feihu
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 252
  • [4] Supplier categorization with K-means type subspace clustering
    Zhang, XJ
    Huang, JZ
    Qian, DP
    Xu, J
    Jing, LP
    [J]. FRONTIERS OF WWW RESEARCH AND DEVELOPMENT - APWEB 2006, PROCEEDINGS, 2006, 3841 : 226 - 237
  • [5] On the performance of feature weighting K-means for text subspace clustering
    Jing, LP
    Ng, MK
    Xu, J
    Huang, JZX
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2005, 3739 : 502 - 512
  • [6] Flexible Subspace Clustering: A Joint Feature Selection and K-Means Clustering Framework
    Long, Zhong-Zhen
    Xu, Guoxia
    Du, Jiao
    Zhu, Hu
    Yan, Taiyu
    Yu, Yu-Feng
    [J]. BIG DATA RESEARCH, 2021, 23
  • [7] Time series k-means: A new k-means type smooth subspace clustering for time series data
    Huang, Xiaohui
    Ye, Yunming
    Xiong, Liyan
    Lau, Raymond Y. K.
    Jiang, Nan
    Wang, Shaokai
    [J]. INFORMATION SCIENCES, 2016, 367 : 1 - 13
  • [8] Sparse Subspace K-means
    Diallo, Abdoul Wahab
    Niang, Ndeye
    Ouattara, Mory
    [J]. 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 678 - 685
  • [9] Subspace clustering of text documents with feature weighting K-means algorithm
    Jing, LP
    Ng, MK
    Xu, J
    Huang, JZ
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2005, 3518 : 802 - 812
  • [10] A k-means type clustering algorithm for subspace clustering of mixed numeric and categorical datasets
    Ahmad, Amir
    Dey, Lipika
    [J]. PATTERN RECOGNITION LETTERS, 2011, 32 (07) : 1062 - 1069