Clustering of High-Dimensional Data via Finite Mixture Models

Cited by: 1
Authors
McLachlan, Geoff J. [1, 2]
Baek, Jangsun
Affiliations
[1] Univ Queensland, Dept Math, Brisbane, Qld 4072, Australia
[2] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
Funding
Australian Research Council
Keywords
Common factor analyzers; Mixtures of factor analyzers; Model-based clustering; Normal mixture densities; Maximum likelihood
DOI
10.1007/978-3-642-01044-6_3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Finite mixture models are commonly used in a wide range of applications involving density estimation and clustering. An attractive feature of this approach to clustering is that it provides a sound statistical framework in which to assess the important question of how many clusters there are in the data and their validity. We review the application of normal mixture models to high-dimensional data of a continuous nature. One way to handle the fitting of normal mixture models is to adopt mixtures of factor analyzers. They enable model-based density estimation and clustering to be undertaken for high-dimensional data, where the number of observations n is not very large relative to their dimension p. In practice, there is often a need to further reduce the number of parameters in the specification of the component-covariance matrices. We focus here on a new modified approach that uses common component-factor loadings, which reduces the number of parameters considerably further. Moreover, it allows the data to be displayed in low-dimensional plots.
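The parameter savings described in the abstract can be illustrated with a short count. The following is a minimal sketch, not taken from the record itself. It assumes the standard parameterizations: an unrestricted normal mixture with g components in p dimensions; a mixture of factor analyzers with component covariances B_i B_i^T + D_i, where B_i is p x q and D_i is diagonal; and a common-loadings variant with means A xi_i and covariances A Omega_i A^T + D. The function names and the example values of g, p, and q are hypothetical.

```python
def full_normal_mixture_params(g: int, p: int) -> int:
    """Free parameters of an unrestricted g-component normal mixture in p dimensions."""
    mixing = g - 1                        # mixing proportions sum to one
    means = g * p
    covariances = g * p * (p + 1) // 2    # one symmetric p x p matrix per component
    return mixing + means + covariances


def mfa_params(g: int, p: int, q: int) -> int:
    """Free parameters of a mixture of factor analyzers with q factors.

    Assumes Sigma_i = B_i B_i^T + D_i; q(q-1)/2 is subtracted per component
    because B_i is only identified up to an orthogonal rotation.
    """
    mixing = g - 1
    means = g * p
    covariances = g * (p * q + p - q * (q - 1) // 2)
    return mixing + means + covariances


def mcfa_params(g: int, p: int, q: int) -> int:
    """Free parameters under common factor loadings (assumed parameterization).

    Assumes mu_i = A xi_i and Sigma_i = A Omega_i A^T + D with a shared
    p x q matrix A and shared diagonal D; q^2 is subtracted because A is
    only identified up to a nonsingular q x q transformation.
    """
    mixing = g - 1
    shared = p + p * q                    # diagonal D plus loadings A
    factor_means = g * q                  # xi_i
    factor_covs = g * q * (q + 1) // 2    # Omega_i
    return mixing + shared + factor_means + factor_covs - q * q


if __name__ == "__main__":
    g, p, q = 3, 100, 2  # illustrative values only
    print("full:", full_normal_mixture_params(g, p))
    print("MFA: ", mfa_params(g, p, q))
    print("MCFA:", mcfa_params(g, p, q))
```

Under these assumptions the count drops sharply from the unrestricted model to the factor-analytic one, and again when the loadings are shared, which is the reduction the abstract refers to when n is not large relative to p.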
Pages: 33 / +
Page count: 3