Hypothesis Testing for Mixture Model Selection

Cited by: 18
Authors
Punzo, Antonio [1]
Browne, Ryan P. [2]
McNicholas, Paul D. [2]
Affiliations
[1] Univ Catania, Dept Econ & Business, Catania, Italy
[2] McMaster Univ, Dept Math & Stat, Hamilton, ON, Canada
Keywords
Closed testing procedures; eigen decomposition; Gaussian mixtures; homoscedasticity; likelihood-ratio tests; LIKELIHOOD RATIO TEST; EM-ALGORITHM; DISCRIMINANT-ANALYSIS; MAXIMUM-LIKELIHOOD; IDENTIFIABILITY; COMPONENTS; DENSITIES; VALUES;
DOI
10.1080/00949655.2015.1131282
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Gaussian mixture models with eigen-decomposed covariance structures, i.e. the Gaussian parsimonious clustering models (GPCM), make up the most popular family of mixture models for clustering and classification. Although the GPCM family has been used for almost 20 years, selecting the best member of the family in a given situation remains a troublesome problem. Likelihood ratio (LR) tests are developed to tackle this problem; given a number of mixture components, these LR tests compare each member of the family to the heteroscedastic model under the alternative hypothesis. Along the way, a novel maximum likelihood estimation procedure is developed for two members of the GPCM family. Simulations show that the χ² reference distribution provides a reasonable approximation for the LR statistics when the sample size is not too small and when the mixture components are separate enough; accordingly, in the remaining configurations, a parametric bootstrap approach is also discussed and evaluated. Furthermore, a closed testing procedure, having the defined LR tests as local tests, is considered to assess, in a straightforward way, a unique model in the general family. In contrast with the information criteria that are often employed in the literature as 'black boxes', this procedure is based on only one subjective element, the significance level, whose meaning is clear to everyone. Simulation results are presented to investigate the performance of the procedure in situations with gradual departure from the homoscedastic model and its robustness with respect to elliptical departures from normality in each mixture component. Finally, the advantages of the procedure are illustrated via applications to some well-known data sets.
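The two ingredients named in the abstract, an LR statistic against the heteroscedastic alternative and a parametric bootstrap fallback, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names are hypothetical, the log-likelihoods are assumed to come from mixture models fitted elsewhere, and the bootstrap statistics are assumed to have been obtained by refitting both models to data simulated from the fitted null model. The parameter count covers only the fully homoscedastic case (one common covariance matrix) against the fully heteroscedastic case (one unconstrained covariance matrix per component); other GPCM members need their own counts.

```python
def covariance_params(p, G, homoscedastic):
    """Free covariance parameters for G components in p dimensions.

    Hypothetical helper: a common covariance matrix has p(p+1)/2 free
    entries; the heteroscedastic model has that many per component.
    """
    per_matrix = p * (p + 1) // 2
    return per_matrix if homoscedastic else G * per_matrix


def lr_statistic(loglik_null, loglik_alt):
    """Likelihood-ratio statistic 2 * (l_alt - l_null)."""
    return 2.0 * (loglik_alt - loglik_null)


def bootstrap_p_value(lr_observed, lr_bootstrap):
    """Parametric bootstrap p-value with the usual +1 correction.

    lr_bootstrap is assumed to hold LR statistics computed on samples
    simulated from the fitted null model.
    """
    exceed = sum(1 for lr in lr_bootstrap if lr >= lr_observed)
    return (exceed + 1) / (len(lr_bootstrap) + 1)


# Illustrative numbers only (not results from the paper).
p, G = 2, 3
df = covariance_params(p, G, False) - covariance_params(p, G, True)
lr_obs = lr_statistic(loglik_null=-105.0, loglik_alt=-100.0)
p_boot = bootstrap_p_value(lr_obs, [1.0, 2.0, 12.0, 3.0])
```

Here `df` is the degrees of freedom that a χ² reference distribution would use for this pair of models, while `p_boot` avoids the χ² approximation altogether, which is the motivation the abstract gives for the bootstrap in small-sample or poorly separated configurations.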
Pages: 2797-2818
Number of pages: 22
Related papers
50 in total
  • [21] Model selection versus traditional hypothesis testing in circular statistics: a simulation study
    Landler, Lukas
    Ruxton, Graeme D.
    Malkemper, E. Pascal
    [J]. BIOLOGY OPEN, 2020, 9 (06):
  • [22] Selection of a time-varying Volterra model using multiple hypothesis testing
    Green, M
    Zoubir, AM
    [J]. CONFERENCE RECORD OF THE THIRTY-FOURTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2000, : 1782 - 1785
  • [23] Application of two distributions mixture to testing hypothesis in auditing
    Sitek, Grzegorz
    [J]. MATHEMATICAL METHODS IN ECONOMICS (MME 2018), 2018, : 482 - 486
  • [24] SEXUAL SELECTION IN CRABS - TESTING THE NULL HYPOTHESIS
    BOTTON, ML
    LOVELAND, RE
    [J]. AMERICAN ZOOLOGIST, 1988, 28 (04): : A52 - A52
  • [25] Randomized Sensor Selection in Sequential Hypothesis Testing
    Srivastava, Vaibhav
    Plarre, Kurt
    Bullo, Francesco
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2011, 59 (05) : 2342 - 2354
  • [26] Adaptive Sensor Selection in Sequential Hypothesis Testing
    Srivastava, Vaibhav
    Plarre, Kurt
    Bullo, Francesco
    [J]. 2011 50TH IEEE CONFERENCE ON DECISION AND CONTROL AND EUROPEAN CONTROL CONFERENCE (CDC-ECC), 2011, : 6284 - 6289
  • [27] Application of Multiple Hypothesis Testing for Beam Selection
    Kadur, Tobias
    Rave, Wolfgang
    Fettweis, Gerhard
    [J]. ICC 2019 - 2019 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2019,
  • [28] Species residency status affects model selection and hypothesis testing in freshwater community ecology
    Bried, Jason T.
    Siepielski, Adam M.
    Dvorett, Daniel
    Jog, Suneeti K.
    Patten, Michael A.
    Feng, Xiao
    Davis, Craig A.
    [J]. FRESHWATER BIOLOGY, 2016, 61 (09) : 1568 - 1579
  • [29] Model selection in time series analysis: using information criteria as an alternative to hypothesis testing
    Hacker, R. Scott
    Hatemi-J, Abdulnasser
    [J]. JOURNAL OF ECONOMIC STUDIES, 2022, 49 (06) : 1055 - 1075
  • [30] Model Selection and Hypothesis Testing for Large-Scale Network Models with Overlapping Groups
    Peixoto, Tiago P.
    [J]. PHYSICAL REVIEW X, 2015, 5 (01):