Penalized model-based clustering with application to variable selection

Cited by: 0
Authors
Pan, Wei [1]
Affiliations
[1] Univ Minnesota, Sch Publ Hlth, Div Biostat, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
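Keywords
BIC; EM; mixture model; penalized likelihood; soft-thresholding; shrinkage
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract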
Variable selection in clustering analysis is both challenging and important. In the context of model-based clustering analysis with a common diagonal covariance matrix, which is especially suitable for "high dimension, low sample size" settings, we propose a penalized likelihood approach with an L-1 penalty function, automatically realizing variable selection via thresholding and delivering a sparse solution. We derive an EM algorithm to fit our proposed model, and propose a modified BIC as a model selection criterion to choose the number of components and the penalization parameter. A simulation study and an application to gene function prediction with gene expression profiles demonstrate the utility of our method.
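The abstract names the ingredients (an L-1 penalty on a mixture model with a common diagonal covariance, fitted by EM, with selection via thresholding) without giving the update formulas. The sketch below illustrates how a soft-thresholded mean update in an M-step could look. It is not taken from the paper: the function names, the parameter name `lam`, the assumption that the data are centered per variable, and the exact threshold form `lam * sigma2 / (cluster weight)` are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Shrink x toward zero by t; entries with |x| <= t become exactly zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def penalized_mean_update(X, resp, sigma2, lam):
    """One illustrative M-step update of cluster means under an L1 penalty.

    X      : (n, p) data, assumed centered so that a zero mean is "uninformative"
    resp   : (n, g) posterior cluster probabilities from the E-step
    sigma2 : (p,) variances of the common diagonal covariance matrix
    lam    : penalization parameter (hypothetical name)
    """
    weights = resp.sum(axis=0)                          # (g,) effective cluster sizes
    mu_tilde = (resp.T @ X) / weights[:, None]          # (g, p) unpenalized weighted means
    thresh = lam * sigma2[None, :] / weights[:, None]   # assumed form of the threshold
    return soft_threshold(mu_tilde, thresh)

def selected_variables(mu):
    """Keep a variable if at least one cluster mean for it is nonzero."""
    return np.flatnonzero(np.any(mu != 0.0, axis=0))

# Toy data: two clusters that differ only on variable 0.
X = np.array([[2.0, 0.3], [2.0, 0.1], [-2.0, -0.1], [-2.0, -0.3]])
resp = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
mu = penalized_mean_update(X, resp, sigma2=np.ones(2), lam=1.0)
kept = selected_variables(mu)
```

In this toy case the small means of variable 1 fall below the threshold and are set exactly to zero in both clusters, so only variable 0 is retained. That is the sense in which thresholding delivers a sparse solution and performs variable selection. In practice the number of components and `lam` would be chosen by a model selection criterion such as the modified BIC mentioned in the abstract, which is not sketched here.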
Pages: 1145 - 1164
Page count: 20
Related papers
50 in total
  • [21] Penalized Model-Based Clustering with Group-Dependent Shrinkage Estimation
    Casa, Alessandro
    Cappozzo, Andrea
    Fop, Michael
    [J]. BUILDING BRIDGES BETWEEN SOFT AND STATISTICAL METHODOLOGIES FOR DATA SCIENCE, 2023, 1433 : 73 - 78
  • [22] Group-Wise Shrinkage Estimation in Penalized Model-Based Clustering
    Alessandro Casa
    Andrea Cappozzo
    Michael Fop
    [J]. Journal of Classification, 2022, 39 : 648 - 674
  • [23] Group-Wise Shrinkage Estimation in Penalized Model-Based Clustering
    Casa, Alessandro
    Cappozzo, Andrea
    Fop, Michael
    [J]. JOURNAL OF CLASSIFICATION, 2022, 39 (03) : 648 - 674
  • [24] Variable selection for model-based clustering using the integrated complete-data likelihood
    Marbac, Matthieu
    Sedki, Mohammed
    [J]. STATISTICS AND COMPUTING, 2017, 27 (04) : 1049 - 1063
  • [25] Variable selection for model-based clustering using the integrated complete-data likelihood
    Matthieu Marbac
    Mohammed Sedki
    [J]. Statistics and Computing, 2017, 27 : 1049 - 1063
  • [26] Variable selection in model-based discriminant analysis
    Maugis, C.
    Celeux, G.
    Martin-Magniette, M-L
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2011, 102 (10) : 1374 - 1387
  • [27] A fuzzy penalized regression model with variable selection
    Kashani, M.
    Arashi, M.
    Rabiei, M. R.
    D'Urso, P.
    De Giovanni, L.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2021, 175
  • [28] Model-based clustering of high-dimensional data: Variable selection versus facet determination
    Poon, Leonard K. M.
    Zhang, Nevin L.
    Liu, Tengfei
    Liu, April H.
    [J]. INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2013, 54 (01) : 196 - 215
  • [29] A DCA Based Algorithm for Feature Selection in Model-Based Clustering
    Viet Anh Nguyen
    Hoai An Le Thi
    Hoai Minh Le
    [J]. INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2020), PT I, 2020, 12033 : 404 - 415
  • [30] Penalized regression with model-based penalties
    Heckman, NE
    Ramsay, JO
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2000, 28 (02) : 241 - 258