Robust clustering via mixtures of t factor analyzers with incomplete data

Cited by: 6
Authors
Wang, Wan-Lun [1]
Lin, Tsung-I [2, 3]
Affiliations
[1] Feng Chia Univ, Grad Inst Stat & Actuarial Sci, Dept Stat, Taichung 40724, Taiwan
[2] Natl Chung Hsing Univ, Inst Stat, Taichung 402, Taiwan
[3] China Med Univ, Dept Publ Hlth, Taichung 404, Taiwan
Keywords
Data reduction; Factor analyzer; Information matrix; Mixture models; Multivariate t distribution; Missing data; MAXIMUM-LIKELIHOOD-ESTIMATION; MULTIVARIATE NORMAL-DISTRIBUTION; ECM ALGORITHM; ML ESTIMATION; MODELS; INFERENCE;
DOI
10.1007/s11634-021-00453-8
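Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract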
Mixtures of t factor analyzers (MtFA) are powerful and widely used tools for robust clustering of high-dimensional data in the presence of outliers. However, the occurrence of missing values may cause analytical intractability and computational complexity when fitting the MtFA model. We explicitly derive the score vector and Hessian matrix of the MtFA model with incomplete data to approximate the information matrix. In this regard, some asymptotic properties can be established under certain regularity conditions. Three expectation-maximization-based algorithms are developed for maximum likelihood estimation of the MtFA model with possibly missing values at random. Practical issues related to the recovery of missing values and clustering of partially observed samples are also investigated. The relevant utility of our methodology is exemplified through the analysis of simulated and real data sets.
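For orientation, the model named in the abstract can be sketched in standard notation. The symbols below are not defined in this record and are assumptions made for illustration: y_j is a p-dimensional observation, g is the number of components, \pi_i are mixing proportions, \mu_i are location vectors, B_i is a p x q factor loading matrix with q < p, D_i is a diagonal matrix, \nu_i are degrees of freedom, and O_j is a hypothetical selection matrix that extracts the observed coordinates y_j^o = O_j y_j. The paper's own notation may differ.

$$
f(y_j) = \sum_{i=1}^{g} \pi_i \, t_p\!\left(y_j;\ \mu_i,\ \Sigma_i,\ \nu_i\right),
\qquad \Sigma_i = B_i B_i^{\top} + D_i .
$$

$$
f(y_j^{o}) = \sum_{i=1}^{g} \pi_i \, t_{p_j^{o}}\!\left(y_j^{o};\ O_j \mu_i,\ O_j \Sigma_i O_j^{\top},\ \nu_i\right),
\qquad
\hat{z}_{ij} = \frac{\pi_i \, t_{p_j^{o}}\!\left(y_j^{o};\ O_j \mu_i,\ O_j \Sigma_i O_j^{\top},\ \nu_i\right)}{f(y_j^{o})} .
$$

The second line relies on the fact that any subvector of a multivariate t vector is again multivariate t with the same degrees of freedom. Under a missing-at-random mechanism, a partially observed sample can then be assigned to the component with the largest posterior probability \hat{z}_{ij}. This is a sketch of the general idea only, not the specific algorithms developed in the paper.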
Pages: 659-690
Number of pages: 32
Related papers
50 in total
  • [41] Dimensionally Reduced Model-Based Clustering Through Mixtures of Factor Mixture Analyzers
    Viroli, Cinzia
    [J]. JOURNAL OF CLASSIFICATION, 2010, 27 (03) : 363 - 388
  • [42] MIXTURES OF FACTOR ANALYZERS AND DEEP MIXTURES OF FACTOR ANALYZERS DIMENSIONALITY REDUCTION ALGORITHMS FOR HYPERSPECTRAL IMAGES CLASSIFICATION
    Zhao, Bin
    Ulfarsson, Magnus O.
    Sveinsson, Johannes R.
    Chanussot, Jocelyn
    [J]. 2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 891 - 894
  • [43] ROBUST CLUSTERING OF DATA COLLECTED VIA CROWDSOURCING
    Pages-Zamora, Alba
    Giannakis, Georgios B.
    Lopez-Valcarce, Roberto
    Gimenez-Febrer, Pere
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4014 - 4018
  • [44] Mixtures of robust probabilistic principal component analyzers
    Archambeau, Cedric
    Delannay, Nicolas
    Verleysen, Michel
    [J]. NEUROCOMPUTING, 2008, 71 (7-9) : 1274 - 1282
  • [45] A Robust Fuzzy c-Means Clustering Algorithm for Incomplete Data
    Li, Jinhua
    Song, Shiji
    Zhang, Yuli
    Li, Kang
    [J]. INTELLIGENT COMPUTING, NETWORKED CONTROL, AND THEIR ENGINEERING APPLICATIONS, PT II, 2017, 762 : 3 - 12
  • [46] A robust self-organizing approach to effectively clustering incomplete data
    Vo Thi Ngoc Chau
    [J]. 2015 SEVENTH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE), 2015, : 150 - 155
  • [47] Mixtures of factor analyzers: an extension with covariates
    Fokoué, E
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2005, 95 (02) : 370 - 384
  • [48] Bayesian analysis of mixtures of factor analyzers
    Utsugi, A
    Kumagai, T
    [J]. NEURAL COMPUTATION, 2001, 13 (05) : 993 - 1002
  • [50] Robust model-based clustering via mixtures of skew-t distributions with missing information
    Wang, Wan-Lun
    Lin, Tsung-I
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2015, 9 (04) : 423 - 445