Mixtures of generalized hyperbolic distributions and mixtures of skew-t distributions for model-based clustering with incomplete data

Cited by: 18
Authors
Wei, Yuhong [1 ]
Tang, Yang [1 ]
McNicholas, Paul D. [1 ]
Affiliations
[1] McMaster Univ, Dept Math & Stat, Hamilton, ON, Canada
Keywords
Clustering; Generalized hyperbolic; Missing data; Mixture models; Skew-t; HIGH-DIMENSIONAL DATA; FACTOR ANALYZERS; MISSING INFORMATION; EM ALGORITHM; DISCRIMINANT-ANALYSIS; CLASSIFICATION; LIKELIHOOD;
DOI
10.1016/j.csda.2018.08.016
Chinese Library Classification number
TP39 [Computer applications];
Subject classification codes
081203 ; 0835 ;
Abstract
Robust clustering from incomplete data is an important topic because, in many practical situations, real datasets are heavy-tailed, asymmetric, and/or have arbitrary patterns of missing observations. Flexible methods and algorithms for model-based clustering are presented via mixture of the generalized hyperbolic distributions and its limiting case, the mixture of multivariate skew-t distributions. An analytically feasible EM algorithm is formulated for parameter estimation and imputation of missing values for mixture models employing missing-at-random mechanisms. The proposed methodologies are investigated through a simulation study with varying proportions of synthetic missing values and illustrated using a real dataset. Comparisons are made with those obtained from the traditional mixture of generalized hyperbolic distribution counterparts by filling in the missing data using the mean imputation method. (C) 2018 Elsevier B.V. All rights reserved.
Pages: 18-41
Page count: 24
Related papers
50 items in total
  • [1] Robust model-based clustering via mixtures of skew-t distributions with missing information
    Wan-Lun Wang
    Tsung-I Lin
    [J]. Advances in Data Analysis and Classification, 2015, 9 : 423 - 445
  • [2] Robust model-based clustering via mixtures of skew-t distributions with missing information
    Wang, Wan-Lun
    Lin, Tsung-I
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2015, 9 (04) : 423 - 445
  • [3] Model-Based Clustering of Non-Gaussian Panel Data Based on Skew-t Distributions
    Juarez, Miguel A.
    Steel, Mark F. J.
    [J]. JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 2010, 28 (01) : 52 - 66
  • [4] Model-based clustering of functional data via mixtures of t distributions
    Anton, Cristina
    Smith, Iain
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2024, 18 (03) : 563 - 595
  • [5] SEQUENTIAL DIRICHLET PROCESS MIXTURES OF MULTIVARIATE SKEW t-DISTRIBUTIONS FOR MODEL-BASED CLUSTERING OF FLOW CYTOMETRY DATA
    Hejblum, Boris P.
    Alkhassim, Chariff
    Gottardo, Raphael
    Caron, François
    Thiebaut, Rodolphe
    [J]. ANNALS OF APPLIED STATISTICS, 2019, 13 (01) : 638 - 660
  • [6] Classification Methods for the Serological Status Based on Mixtures of Skew-Normal and Skew-t Distributions
    Dias-Domingues, Tiago
    Mourino, Helena
    Sepulveda, Nuno
    [J]. MATHEMATICS, 2024, 12 (02)
  • [7] The orthogonal skew model: computationally efficient multivariate skew-normal and skew-t distributions with applications to model-based clustering
    Browne, Ryan P.
    Andrews, Jeffrey L.
    [J]. TEST, 2024,
  • [8] An overview of skew distributions in model-based clustering
    Lee, Sharon X.
    McLachlan, Geoffrey J.
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2022, 188
  • [9] Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions
    Fruehwirth-Schnatter, Sylvia
    Pyne, Saumyadipta
    [J]. BIOSTATISTICS, 2010, 11 (02) : 317 - 336
  • [10] Dimension reduction for model-based clustering via mixtures of multivariate t-distributions
    Morris, Katherine
    McNicholas, Paul D.
    Scrucca, Luca
    [J]. ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2013, 7 (03) : 321 - 338