Hypothesis Testing for Mixture Model Selection

Cited by: 18
Authors
Punzo, Antonio [1]
Browne, Ryan P. [2]
McNicholas, Paul D. [2]
Affiliations
[1] Univ Catania, Dept Econ & Business, Catania, Italy
[2] McMaster Univ, Dept Math & Stat, Hamilton, ON, Canada
Keywords
Closed testing procedures; eigen decomposition; Gaussian mixtures; homoscedasticity; likelihood-ratio tests; likelihood ratio test; EM algorithm; discriminant analysis; maximum likelihood; identifiability; components; densities; values
DOI
10.1080/00949655.2015.1131282
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Gaussian mixture models with eigen-decomposed covariance structures, i.e. the Gaussian parsimonious clustering models (GPCM), make up the most popular family of mixture models for clustering and classification. Although the GPCM family has been used for almost 20 years, selecting the best member of the family in a given situation remains a troublesome problem. Likelihood ratio (LR) tests are developed to tackle this problem; given a number of mixture components, these LR tests compare each member of the family to the heteroscedastic model under the alternative hypothesis. Along the way, a novel maximum likelihood estimation procedure is developed for two members of the GPCM family. Simulations show that the χ² reference distribution provides a reasonable approximation for the LR statistics when the sample size is not too small and when the mixture components are separate enough; accordingly, in the remaining configurations, a parametric bootstrap approach is also discussed and evaluated. Furthermore, a closed testing procedure, having the defined LR tests as local tests, is considered to assess, in a straightforward way, a unique model in the general family. In contrast with the information criteria that are often employed in the literature as 'black boxes', it is based on only one subjective element, the significance level, whose meaning is clear to everyone. Simulation results are presented to investigate the performance of the procedure in situations with gradual departure from the homoscedastic model and its robustness with respect to elliptical departures from normality in each mixture component. Finally, the advantages of the procedure are illustrated via applications to some well-known data sets.
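As a rough illustration of the two ways the abstract mentions for calibrating an LR statistic (a χ² reference distribution and a parametric bootstrap), the following Python sketch may help. It is not the authors' implementation. The function names are hypothetical, the log-likelihoods are assumed to come from models already fitted elsewhere, the degrees-of-freedom count covers only the fully homoscedastic (common covariance matrix) null against the heteroscedastic alternative, and the add-one bootstrap p-value is one common convention rather than something stated in the abstract.

```python
import math


def covariance_df(n_components, dim):
    """Difference in free covariance parameters between a heteroscedastic
    mixture (one unconstrained covariance matrix per component) and a fully
    homoscedastic one (a single common matrix). Assumed parameter counting."""
    per_matrix = dim * (dim + 1) // 2
    return (n_components - 1) * per_matrix


def lr_statistic(loglik_null, loglik_alt):
    """Likelihood ratio statistic from two maximized log-likelihoods."""
    return 2.0 * (loglik_alt - loglik_null)


def chi2_sf(x, df):
    """Upper tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form base cases for df = 1 (odd) and df = 2 (even).
    if df % 2 == 0:
        k, q = 2, math.exp(-half)
    else:
        k, q = 1, math.erfc(math.sqrt(half))
    # Recurrence Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1).
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def bootstrap_p_value(observed, replicates):
    """Parametric bootstrap p-value: share of LR statistics, computed on data
    simulated from the fitted null model, that reach the observed value
    (add-one convention)."""
    exceed = sum(1 for r in replicates if r >= observed)
    return (exceed + 1) / (len(replicates) + 1)


# Hypothetical log-likelihoods for a 3-component, 2-dimensional mixture.
lr = lr_statistic(loglik_null=-512.4, loglik_alt=-503.1)
print(lr, chi2_sf(lr, covariance_df(3, 2)))
```

In practice the bootstrap replicates would be obtained by simulating data sets from the fitted null model and refitting both models to each one; that step is omitted here because it depends on the fitting routine.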
Pages: 2797-2818
Page count: 22