Penalized model-based clustering with application to variable selection

Cited by: 0
Authors
Pan, Wei [1]
Affiliations
[1] Univ Minnesota, Sch Publ Hlth, Div Biostat, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
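Keywords
BIC; EM; mixture model; penalized likelihood; soft-thresholding; shrinkage
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract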
Variable selection in clustering analysis is both challenging and important. In the context of model-based clustering analysis with a common diagonal covariance matrix, which is especially suitable for "high dimension, low sample size" settings, we propose a penalized likelihood approach with an L1 penalty function, which automatically realizes variable selection via thresholding and delivers a sparse solution. We derive an EM algorithm to fit our proposed model, and propose a modified BIC as a model selection criterion to choose the number of components and the penalization parameter. A simulation study and an application to gene function prediction with gene expression profiles demonstrate the utility of our method.
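As an illustration of how an L1 penalty leads to thresholding inside EM, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian mixture with a common diagonal covariance, an L1 penalty on the component means, and data standardized to mean zero. Under those assumptions, setting the derivative of the penalized M-step objective to zero gives a soft-thresholded weighted mean with threshold `lam * var / nk`. All function names, the initialization, the fixed iteration count, and the variance update are hypothetical choices made for this example; the modified BIC is not shown.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def e_step(X, pi, mu, var):
    """Posterior membership probabilities tau, shape (n, K)."""
    diff = X[:, None, :] - mu[None, :, :]                  # (n, K, p)
    log_dens = -0.5 * np.sum(diff**2 / var + np.log(2 * np.pi * var), axis=2)
    log_w = np.log(pi) + log_dens
    log_w -= log_w.max(axis=1, keepdims=True)              # numerical stability
    tau = np.exp(log_w)
    return tau / tau.sum(axis=1, keepdims=True)

def m_step(X, tau, var, lam):
    """Update mixing proportions, soft-thresholded means, shared variances."""
    n = X.shape[0]
    nk = np.maximum(tau.sum(axis=0), 1e-12)                # guard against empty components
    pi = nk / nk.sum()
    mu_tilde = (tau.T @ X) / nk[:, None]                   # unpenalized weighted means
    # Assumed penalized update: shrink each mean toward zero by lam * var / nk.
    mu = soft_threshold(mu_tilde, lam * var[None, :] / nk[:, None])
    diff = X[:, None, :] - mu[None, :, :]
    var_new = np.einsum("nk,nkp->p", tau, diff**2) / n     # common diagonal covariance
    return pi, mu, np.maximum(var_new, 1e-8)

def fit_penalized_gmm(X, K, lam, n_iter=100, seed=0):
    """Run a fixed number of EM iterations (hypothetical stopping rule)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    mu = X[rng.choice(n, size=K, replace=False)].copy()    # random observations as initial means
    var = np.maximum(X.var(axis=0), 1e-8)
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        tau = e_step(X, pi, mu, var)
        pi, mu, var = m_step(X, tau, var, lam)
    return pi, mu, var, tau
```

In this sketch, a variable whose estimated mean is exactly zero in every component has the same fitted distribution in all components, so it does not influence the cluster assignments. This is the sense in which thresholding can act as variable selection. Larger values of `lam` set more means to zero.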
Pages: 1145-1164
Number of pages: 20
Related papers
50 results in total
  • [1] Variable selection in penalized model-based clustering via regularization on grouped parameters
    Xie, Benhuai
    Pan, Wei
    Shen, Xiaotong
    [J]. BIOMETRICS, 2008, 64 (03) : 921 - 930
  • [2] Variable selection for model-based clustering
    Raftery, AE
    Dean, N
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (473) : 168 - 178
  • [3] Variable selection methods for model-based clustering
    Fop, Michael
    Murphy, Thomas Brendan
    [J]. STATISTICS SURVEYS, 2018, 12 : 18 - 65
  • [4] Comparing Model Selection and Regularization Approaches to Variable Selection in Model-Based Clustering
    Celeux, Gilles
    Martin-Magniette, Marie-Laure
    Maugis-Rabusseau, Cathy
    Raftery, Adrian E.
    [J]. JOURNAL OF THE SFDS, 2014, 155 (02) : 57 - 71
  • [5] Variable selection in model-based clustering: A general variable role modeling
    Maugis, C.
    Celeux, G.
    Martin-Magniette, M. -L.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (11) : 3872 - 3882
  • [6] Penalized model-based clustering of fMRI data
    Dilernia, Andrew
    Quevedo, Karina
    Camchong, Jazmin
    Lim, Kelvin
    Pan, Wei
    Zhang, Lin
    [J]. BIOSTATISTICS, 2022, 23 (03) : 825 - 843
  • [7] Variable Selection for Skewed Model-Based Clustering: Application to the Identification of Novel Sleep Phenotypes
    Wallace, Meredith L.
    Buysse, Daniel J.
    Germain, Anne
    Hall, Martica H.
    Iyengar, Satish
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (521) : 95 - 110
  • [8] A simple model-based approach to variable selection in classification and clustering
    Partovi Nia, Vahid
    Davison, Anthony C.
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2015, 43 (02) : 157 - 175
  • [9] Variable selection for model-based high-dimensional clustering
    Wang, Sijian
    Zhu, Ji
    [J]. PREDICTION AND DISCOVERY, 2007, 443 : 177 - +
  • [10] Variable selection for model-based high-dimensional clustering and its application to microarray data
    Wang, Sijian
    Zhu, Ji
    [J]. BIOMETRICS, 2008, 64 (02) : 440 - 448