Variable selection methods for model-based clustering

Cited by: 64
Authors
Fop, Michael [1]
Murphy, Thomas Brendan [1]
Affiliations
[1] Univ Coll Dublin, Dublin, Ireland
Funding
Science Foundation Ireland
Keywords
Gaussian mixture model; latent class analysis; model-based clustering; R packages; variable selection
DOI
10.1214/18-SS119
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Model-based clustering is a popular approach for clustering multivariate data which has seen applications in numerous fields. Nowadays, high-dimensional data are more and more common and the model-based clustering approach has adapted to deal with the increasing dimensionality. In particular, the development of variable selection techniques has received a lot of attention and research effort in recent years. Even for small size problems, variable selection has been advocated to facilitate the interpretation of the clustering results. This review provides a summary of the methods developed for variable selection in model-based clustering. Existing R packages implementing the different methods are indicated and illustrated in application to two data analysis examples.
Pages: 18-65
Number of pages: 48
Related papers
50 in total
  • [1] Variable selection for model-based clustering
    Raftery, AE
    Dean, N
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (473) : 168 - 178
  • [2] Penalized model-based clustering with application to variable selection
    Pan, Wei
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2007, 8 : 1145 - 1164
  • [3] Comparing Model Selection and Regularization Approaches to Variable Selection in Model-Based Clustering
    Celeux, Gilles
    Martin-Magniette, Marie-Laure
    Maugis-Rabusseau, Cathy
    Raftery, Adrian E.
    [J]. JOURNAL OF THE SFDS, 2014, 155 (02) : 57 - 71
  • [4] Variable selection in model-based clustering: A general variable role modeling
    Maugis, C.
    Celeux, G.
    Martin-Magniette, M. -L.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (11) : 3872 - 3882
  • [5] A simple model-based approach to variable selection in classification and clustering
    Partovi Nia, Vahid
    Davison, Anthony C.
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2015, 43 (02) : 157 - 175
  • [6] Variable selection for model-based high-dimensional clustering
    Wang, Sijian
    Zhu, Ji
    [J]. PREDICTION AND DISCOVERY, 2007, 443 : 177 - +
  • [7] Pairwise Variable Selection for High-Dimensional Model-Based Clustering
    Guo, Jian
    Levina, Elizaveta
    Michailidis, George
    Zhu, Ji
    [J]. BIOMETRICS, 2010, 66 (03) : 793 - 804
  • [8] Variable selection in model-based clustering and discriminant analysis with a regularization approach
    Celeux, Gilles
    Maugis-Rabusseau, Cathy
    Sedki, Mohammed
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2019, 13 (01) : 259 - 278
  • [9] Variable selection in model-based clustering using multilocus genotype data
    Toussile W.
    Gassiat E.
    [J]. Advances in Data Analysis and Classification, 2009, 3 (2) : 109 - 134