Overfitting Bayesian mixtures of factor analyzers with an unknown number of components

Cited by: 8
Authors
Papastamoulis, Panagiotis [1]
Affiliations
[1] Univ Manchester, Fac Biol Med & Hlth, Div Informat Imaging & Data Sci, Michael Smith Bldg,Oxford Rd, Manchester M13 9PL, Lancs, England
Keywords
Factor analysis; Mixture models; Clustering; MCMC; CHAIN MONTE-CARLO; LABEL SWITCHING PROBLEM; MAXIMUM-LIKELIHOOD; R PACKAGE; MODEL; DISTRIBUTIONS; DEVIANCE; CRITERIA;
DOI
10.1016/j.csda.2018.03.007
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Recent advances in overfitting Bayesian mixture models provide a solid and straightforward approach for inferring the underlying number of clusters and model parameters in heterogeneous datasets. The applicability of such a framework in clustering correlated high dimensional data is demonstrated. For this purpose an overfitting mixture of factor analyzers is introduced, assuming that the number of factors is fixed. A Markov chain Monte Carlo (MCMC) sampler combined with a prior parallel tempering scheme is used to estimate the posterior distribution of model parameters. The optimal number of factors is estimated using information criteria. Identifiability issues related to the label switching problem are dealt with by post-processing the simulated MCMC sample with relabeling algorithms. The method is benchmarked against state-of-the-art software for maximum likelihood estimation of mixtures of factor analyzers using an extensive simulation study. Finally, the applicability of the method is illustrated on publicly available data. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 220-234
Number of pages: 15
Related papers
50 in total
  • [31] Mitigating Outliers for Bayesian Mixture of Factor Analyzers
    Chen, Zhongtao
    Cheng, Lei
    [J]. 2020 IEEE 11TH SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING WORKSHOP (SAM), 2020,
  • [32] Maximum likelihood estimation of mixtures of factor analyzers
    Montanari, Angela
    Viroli, Cinzia
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, 55 (09) : 2712 - 2723
  • [33] Mixtures of Hidden Truncation Hyperbolic Factor Analyzers
    Paula M. Murray
    Ryan P. Browne
    Paul D. McNicholas
    [J]. Journal of Classification, 2020, 37 : 366 - 379
  • [34] Voice Conversion Based on Mixtures of Factor Analyzers
    Uto, Yosuke
    Nankaku, Yoshihiko
    Toda, Tomoki
    Lee, Akinobu
    Tokuda, Keiichi
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2278 - +
  • [35] Mixtures of skew-t factor analyzers
    Murray, Paula M.
    Browne, Ryan P.
    McNicholas, Paul D.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2014, 77 : 326 - 335
  • [36] Mixtures of Hidden Truncation Hyperbolic Factor Analyzers
    Murray, Paula M.
    Browne, Ryan P.
    McNicholas, Paul D.
    [J]. JOURNAL OF CLASSIFICATION, 2020, 37 (02) : 366 - 379
  • [37] (SEMI-) SUPERVISED MIXTURES OF FACTOR ANALYZERS AND DEEP MIXTURES OF FACTOR ANALYZERS DIMENSIONALITY REDUCTION ALGORITHMS FOR HYPERSPECTRAL IMAGES CLASSIFICATION
    Zhao, Bin
    Sveinsson, Johannes R.
    Ulfarsson, Magnus O.
    Chanussot, Jocelyn
    [J]. 2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 887 - 890
  • [38] Mixtures of factor analyzers with scale mixtures of fundamental skew normal distributions
    Lee, Sharon X.
    Lin, Tsung-I
    McLachlan, Geoffrey J.
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2021, 15 (02) : 481 - 512
  • [39] Mixtures of factor analyzers with scale mixtures of fundamental skew normal distributions
    Sharon X. Lee
    Tsung-I Lin
    Geoffrey J. McLachlan
    [J]. Advances in Data Analysis and Classification, 2021, 15 : 481 - 512
  • [40] RECONSTRUCTION OF MASS-SPECTRA OF COMPONENTS OF UNKNOWN MIXTURES BASED ON FACTOR-ANALYSIS
    CHEN, JH
    HWANG, LP
    [J]. ANALYTICA CHIMICA ACTA-COMPUTER TECHNIQUES AND OPTIMIZATION, 1981, 5 (03): : 271 - 281