Robust, fuzzy, and parsimonious clustering, based on mixtures of Factor Analyzers

Cited by: 3
Authors
Angel Garcia-Escudero, Luis [1, 2]
Greselin, Francesca [3]
Mayo Iscar, Agustin [1, 2]
Affiliations
[1] Univ Valladolid, Dept Stat & Operat Res, Valladolid, Spain
[2] Univ Valladolid, IMUVA, Valladolid, Spain
[3] Milano Bicocca Univ, Dept Stat & Quantitat Methods, Milan, Italy
Keywords
Fuzzy clustering; Robust clustering; Unsupervised learning; Factor analysis; Hard contrast; Dimension reduction; Outliers identification; Maximum likelihood estimation; Algorithms; Noise
DOI
10.1016/j.ijar.2018.01.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A clustering algorithm that combines the advantages of fuzzy clustering and robust statistical estimators is presented. It is based on mixtures of Factor Analyzers, combined with the joint use of impartial trimming and constrained estimation of scatter matrices, in a modified maximum likelihood approach. The algorithm generates a set of membership values that are used to fuzzily partition the data set and to contribute to the robust estimates of the mixture parameters. Modeling the clusters by Gaussian Factor Analysis allows for dimension reduction and for discovering local linear structures in the data. The new methodology is shown to be resistant to different types of contamination by applying it to artificial data. The tuning parameters, such as the trimming level, the fuzzifier parameter, the number of clusters, and the value of the scatter matrices constraint, are briefly discussed, along with some heuristic tools for their choice. Finally, a real data set is analyzed to show how intermediate membership values are estimated for observations lying at cluster overlaps, while cluster cores are composed of observations that are assigned to a cluster in a crisp way. (C) 2018 Elsevier Inc. All rights reserved.
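For orientation, the Gaussian Factor Analysis structure mentioned in the abstract can be written out as below. This is the standard, non-robust mixture of Factor Analyzers formulation and is not taken from this record; the symbols G, p, d, \pi_g, \mu_g, \Lambda_g and \Psi_g are notation assumed here for illustration. The trimming, fuzzifier and scatter constraints that the paper adds are not reproduced.

```latex
% Standard mixture of factor analyzers; all notation is assumed, not from the record.
% Within cluster g, a p-dimensional observation x is generated from
% d < p latent factors u plus idiosyncratic noise e, with u and e independent:
\[
  x = \mu_g + \Lambda_g u + e, \qquad
  u \sim N_d(0, I_d), \quad
  e \sim N_p(0, \Psi_g), \quad
  \Psi_g \ \text{diagonal}.
\]
% Each cluster covariance is therefore low-rank plus diagonal,
% and the data follow a G-component Gaussian mixture:
\[
  \Sigma_g = \Lambda_g \Lambda_g^{\top} + \Psi_g ,
  \qquad
  f(x) = \sum_{g=1}^{G} \pi_g \, \phi_p\!\left(x;\ \mu_g,\ \Sigma_g\right).
\]
```

Because \Lambda_g is p × d with d < p, each \Sigma_g needs far fewer parameters than an unrestricted p × p covariance matrix, which is where the parsimony and dimension reduction referred to in the abstract come from.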
Pages: 60-75
Number of pages: 16