Clustering of High-Dimensional Data via Finite Mixture Models

Authors
McLachlan, Geoff J. [1, 2]
Baek, Jangsun
Affiliations
[1] Univ Queensland, Dept Math, Brisbane, Qld 4072, Australia
[2] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
Funding
Australian Research Council
Keywords
Common factor analyzers; Mixtures of factor analyzers; Model-based clustering; Normal mixture densities; Maximum likelihood
DOI
10.1007/978-3-642-01044-6_3
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Finite mixture models are commonly used in a wide range of practical applications involving density estimation and clustering. An attractive feature of this approach to clustering is that it provides a sound statistical framework in which to assess the important question of how many clusters there are in the data and how valid they are. We review the application of normal mixture models to high-dimensional continuous data. One way to handle the fitting of normal mixture models in this setting is to adopt mixtures of factor analyzers. They enable model-based density estimation and clustering to be undertaken for high-dimensional data, where the number of observations n is not very large relative to their dimension p. In practice, there is often a need to reduce still further the number of parameters in the specification of the component-covariance matrices. We focus here on a new, modified approach that uses common component-factor loadings, which reduces the number of parameters considerably further. Moreover, it allows the data to be displayed in low-dimensional plots.
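To make the parameter savings mentioned in the abstract concrete, the following is a minimal sketch that counts free parameters for three models. The abstract itself gives no formulas, so the parameterisations are assumptions taken from the usual textbook forms: unrestricted component covariances; mixtures of factor analyzers with Sigma_i = B_i B_i^T + D_i (B_i of size p x q, D_i diagonal); and common factor loadings with mu_i = A xi_i and Sigma_i = A Omega_i A^T + D. The function names and the example values g = 3, p = 50, q = 2 are illustrative only.

```python
def full_mixture_params(g: int, p: int) -> int:
    """Free parameters of a g-component normal mixture in p dimensions
    with unrestricted component-covariance matrices."""
    weights = g - 1                      # mixing proportions sum to one
    means = g * p                        # one p-vector per component
    covariances = g * p * (p + 1) // 2   # one symmetric p x p matrix per component
    return weights + means + covariances


def mfa_params(g: int, p: int, q: int) -> int:
    """Free parameters of a mixture of factor analyzers (assumed form:
    Sigma_i = B_i B_i^T + D_i with q factors per component)."""
    # p*q loadings plus p diagonal entries, minus q(q-1)/2 for the
    # rotational indeterminacy of the loadings.
    per_component_cov = p * q + p - q * (q - 1) // 2
    return (g - 1) + g * p + g * per_component_cov


def mcfa_params(g: int, p: int, q: int) -> int:
    """Free parameters under common factor loadings (assumed form:
    mu_i = A xi_i, Sigma_i = A Omega_i A^T + D)."""
    shared = p * q + p                        # common loadings A and diagonal D
    per_component = q + q * (q + 1) // 2      # xi_i and symmetric Omega_i
    identifiability = q * q                   # A is determined only up to a q x q transform
    return (g - 1) + shared + g * per_component - identifiability


if __name__ == "__main__":
    g, p, q = 3, 50, 2  # hypothetical example sizes
    print("unrestricted:", full_mixture_params(g, p))   # 3977
    print("factor analyzers:", mfa_params(g, p, q))     # 599
    print("common loadings:", mcfa_params(g, p, q))     # 163
```

Under these assumptions the count grows quadratically in p for unrestricted covariances but only linearly in p for both factor-analytic forms, and sharing the loadings removes the factor g from the dominant p*q term.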
Pages: 33 / +
Page count: 3