A LASSO-penalized BIC for mixture model selection

Cited by: 20
Authors
Bhattacharya, Sakyajit [1]
McNicholas, Paul D. [1]
Affiliations
[1] Univ Guelph, Dept Math & Stat, Guelph, ON N1G 2W1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
BIC; LASSO; Mixture models; Model-based clustering; Model selection; Variable selection; Information criterion; Oracle properties; EM algorithm; Likelihood; Shrinkage; Choice
DOI
10.1007/s11634-013-0155-1
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
The efficacy of family-based approaches to mixture model-based clustering and classification depends on the selection of parsimonious models. Current wisdom suggests the Bayesian information criterion (BIC) for mixture model selection. However, the BIC has well-known limitations, including a tendency to overestimate the number of components as well as a proclivity for underestimating, often drastically, the number of components in higher dimensions. While the former problem might be soluble by merging components, the latter is impossible to mitigate in clustering and classification applications. In this paper, a LASSO-penalized BIC (LPBIC) is introduced to overcome this problem. This approach is illustrated based on applications of extensions of mixtures of factor analyzers, where the LPBIC is used to select both the number of components and the number of latent factors. The LPBIC is shown to match or outperform the BIC in several situations.
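The baseline the abstract compares against, ordinary BIC selection over the number of components and latent factors, can be sketched as follows. This is a minimal illustration only: it is not the LPBIC, whose penalty is defined in the paper and not reproduced in this record. It assumes the "larger is better" form BIC = 2·loglik − ρ·log n and an unconstrained mixture of factor analyzers, where each component has its own loading matrix and diagonal noise covariance. The function names `mfa_free_params`, `bic`, and `select_model` are hypothetical.

```python
import math


def mfa_free_params(G: int, p: int, q: int) -> int:
    """Free parameters of an unconstrained G-component mixture of factor
    analyzers in p dimensions with q latent factors (assumed model form)."""
    mixing = G - 1                              # mixing proportions sum to one
    means = G * p                               # one mean vector per component
    loadings = G * (p * q - q * (q - 1) // 2)   # loadings, up to rotation
    noise = G * p                               # diagonal noise covariances
    return mixing + means + loadings + noise


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """BIC in the 'larger is better' convention (an assumption here)."""
    return 2.0 * loglik - n_params * math.log(n_obs)


def select_model(logliks: dict, p: int, n_obs: int) -> tuple:
    """Pick the (G, q) pair with the largest BIC.

    `logliks` maps (G, q) to the maximized log-likelihood of that fitted
    model; fitting itself (e.g. by an EM-type algorithm) is out of scope.
    """
    return max(
        logliks,
        key=lambda gq: bic(logliks[gq], mfa_free_params(gq[0], p, gq[1]), n_obs),
    )
```

Because the parameter count grows with both p and G, the penalty term can dominate in higher dimensions, which is consistent with the underestimation the abstract describes.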
Pages: 45-61
Number of pages: 17
Related papers
50 in total
  • [1] A LASSO-penalized BIC for mixture model selection
    Sakyajit Bhattacharya
    Paul D. McNicholas
    [J]. Advances in Data Analysis and Classification, 2014, 8 : 45 - 61
  • [2] A model selection method based on the adaptive LASSO-penalized GEE and weighted Gaussian pseudo-likelihood BIC in longitudinal robust analysis
    Zhang, Jiamao
    Xu, Jianwen
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2018, 47 (23) : 5779 - 5794
  • [3] A globally convergent algorithm for lasso-penalized mixture of linear regression models
    Lloyd-Jones, Luke R.
    Nguyen, Hien D.
    McLachlan, Geoffrey J.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2018, 119 : 19 - 38
  • [4] Fast Expectation Propagation for Heteroscedastic, Lasso-Penalized, and Quantile Regression
    Zhou, Jackson
    Ormerod, John T.
    Grazian, Clara
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [5] Mixture model selection via hierarchical BIC
    Zhao, Jianhua
    Jin, Libin
    Shi, Lei
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2015, 88 : 139 - 153
  • [6] LASSO-penalized clusterwise linear regression modelling: a two-step approach
    Di Mari, Roberto
    Rocci, Roberto
    Gattone, Stefano Antonio
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2023, 93 (18) : 3235 - 3258
  • [7] Measuring associations between the microbiota and repeated measures of continuous clinical variables using a lasso-penalized generalized linear mixed model
    Tipton, Laura
    Cuenco, Karen T.
    Huang, Laurence
    Greenblatt, Ruth M.
    Kleerup, Eric
    Sciurba, Frank
    Duncan, Steven R.
    Donahoe, Michael P.
    Morris, Alison
    Ghedin, Elodie
    [J]. BIODATA MINING, 2018, 11
  • [8] Entropy penalized automated model selection on Gaussian mixture
    Ma, JW
    Wang, TJ
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2004, 18 (08) : 1501 - 1512
  • [9] Measuring associations between the microbiota and repeated measures of continuous clinical variables using a lasso-penalized generalized linear mixed model
    Laura Tipton
    Karen T. Cuenco
    Laurence Huang
    Ruth M. Greenblatt
    Eric Kleerup
    Frank Sciurba
    Steven R. Duncan
    Michael P. Donahoe
    Alison Morris
    Elodie Ghedin
    [J]. BioData Mining, 11
  • [10] Handling high predictor dimensionality in slope-unit-based landslide susceptibility models through LASSO-penalized Generalized Linear Model
    Camilo, Daniela Castro
    Lombardo, Luigi
    Mai, P. Martin
    Dou, Jie
    Huser, Raphael
    [J]. ENVIRONMENTAL MODELLING & SOFTWARE, 2017, 97 : 145 - 156