A Model-Based Approach to Simultaneous Clustering and Dimensional Reduction of Ordinal Data

Cited by: 2
Authors
Ranalli, Monia [1]
Rocci, Roberto [2]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Univ Tor Vergata, Rome, Italy
Keywords
mixture models; reduction ordinal data; composite likelihood; structural equation models; variable selection; mixture models; likelihood; extension; invariance; analyzers; criteria
DOI
10.1007/s11336-017-9578-5
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
The literature on clustering continuous data is rich and wide; by contrast, the literature on clustering categorical data is still limited. In some cases, the clustering problem is made harder by noise variables or dimensions that carry no information about the clustering structure and may mask it. This paper proposes a model for simultaneous clustering and dimensionality reduction of ordered categorical data that detects the discriminative dimensions and discards the noise ones. Following the underlying response variable approach, the observed variables are treated as a discretization of underlying first-order latent continuous variables distributed as a Gaussian mixture. To separate discriminative from noise dimensions, these first-order variables are modeled as linear combinations of two independent sets of second-order latent variables: one set carries the information about the cluster structure, while the other contains only noise dimensions. The model specification involves multidimensional integrals that make maximum likelihood estimation cumbersome and, in some cases, infeasible. To overcome this issue, the parameters are estimated with an EM-like algorithm that maximizes a composite log-likelihood based on low-dimensional margins. Applications to real and simulated data illustrate the effectiveness of the proposal.
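The data-generating structure described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `simulate_ordinal_mixture`, the number of variables, the cluster means, the random loading matrix and the shared thresholds are all hypothetical choices made for illustration. It covers data generation only; the composite-likelihood EM-like estimation is not shown.

```python
import numpy as np


def simulate_ordinal_mixture(n=500, seed=0):
    """Simulate ordinal data by discretizing latent Gaussian-mixture variables.

    Every size, mean, loading and threshold below is an illustrative
    assumption, not a value taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_disc, n_noise = 2, 4            # discriminative and noise dimensions (assumed)
    n_vars = n_disc + n_noise         # number of observed ordinal variables

    # Cluster membership: two equally weighted components (assumed).
    weights = np.array([0.5, 0.5])
    labels = rng.choice(len(weights), size=n, p=weights)

    # Second-order latent variables: the discriminative set follows a
    # Gaussian mixture, the noise set is a single Gaussian independent
    # of the cluster labels.
    cluster_means = np.array([[-1.5, -1.5], [1.5, 1.5]])
    discriminative = cluster_means[labels] + rng.standard_normal((n, n_disc))
    noise = rng.standard_normal((n, n_noise))
    second_order = np.hstack([discriminative, noise])

    # First-order latent variables: linear combinations of both sets.
    loadings = rng.standard_normal((n_vars, n_vars))
    first_order = second_order @ loadings.T

    # Observed variables: count how many thresholds each latent value
    # exceeds, giving ordered categories 0..3 (thresholds are assumed).
    thresholds = np.array([-1.0, 0.0, 1.0])
    ordinal = (first_order[:, :, None] > thresholds[None, None, :]).sum(axis=2)
    return ordinal, labels


ordinal, labels = simulate_ordinal_mixture(n=200, seed=1)
print(ordinal[:5])
```

In this sketch the cluster information lives only in the first two second-order dimensions, yet after the linear mixing it is spread over all six observed variables, which is the situation the proposed model is designed to untangle.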
Pages: 1007-1034
Number of pages: 28
Source: Psychometrika, 2017, 82: 1007-1034