Penalized model-based clustering of complex functional data

Cited by: 2
Authors
Pronello, Nicola [1 ]
Ignaccolo, Rosaria [2 ]
Ippoliti, Luigi [3 ]
Fontanella, Sara [4 ]
Affiliations
[1] Univ G dAnnunzio, Dept Neurosci Imaging & Clin Sci, Pescara, Italy
[2] Univ Turin, Dept Econ & Stat Cognetti de Martiis, Turin, Italy
[3] Univ G dAnnunzio, Dept Econ, Pescara, Italy
[4] Imperial Coll London, Natl Heart & Lung Inst, London, England
Keywords
Functional zoning; Manifold data; Mixture models; Shape analysis; Spatial clustering; Surface data; CLASSIFICATION; REGRESSION; DIFFUSION; SPLINES; CURVES;
DOI
10.1007/s11222-023-10288-2
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
High-dimensional data, large-scale data, imaging and manifold data are all fostering new frontiers of statistics. These types of data are commonly considered in Functional Data Analysis, where they are viewed as infinite-dimensional random vectors in a functional space. The rapid development of new technologies has generated a flow of complex data that has led scientists to develop new modeling strategies. In this paper, we address the problem of clustering a set of complex functional data into homogeneous groups. Working in a mixture model-based framework, we develop a flexible clustering technique that achieves dimensionality reduction through an L1 penalization. The proposed procedure results in an integrated modeling approach where shrinkage techniques are applied to enable sparse solutions in both the means and the covariance matrices of the mixture components, while preserving the underlying clustering structure. This leads to an entirely data-driven methodology suitable for simultaneous dimensionality reduction and clustering. The proposed methodology is evaluated through a Monte Carlo simulation study and an empirical analysis of real-world datasets showing different degrees of complexity.
Pages: 20
Related papers
50 in total
  • [21] Model-based clustering with missing not at random data
    Sportisse, Aude
    Marbac, Matthieu
    Laporte, Fabien
    Celeux, Gilles
    Boyer, Claire
    Josse, Julie
    Biernacki, Christophe
    [J]. STATISTICS AND COMPUTING, 2024, 34 (04)
  • [22] On model-based clustering of skewed matrix data
    Melnykov, Volodymyr
    Zhu, Xuwen
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2018, 167 : 181 - 194
  • [23] Model-based Clustering and Classification for Data Science
    Unwin, Antony
    [J]. INTERNATIONAL STATISTICAL REVIEW, 2020, 88 (01) : 263 - 264
  • [24] Model-based clustering of array CGH data
    Shah, Sohrab P.
    Cheung, K-John, Jr.
    Johnson, Nathalie A.
    Alain, Guillaume
    Gascoyne, Randy D.
    Horsman, Douglas E.
    Ng, Raymond T.
    Murphy, Kevin P.
    [J]. BIOINFORMATICS, 2009, 25 (12) : I30 - I38
  • [25] Model-based multidimensional clustering of categorical data
    Chen, Tao
    Zhang, Nevin L.
    Liu, Tengfei
    Poon, Kin Man
    Wang, Yi
    [J]. ARTIFICIAL INTELLIGENCE, 2012, 176 (01) : 2246 - 2269
  • [26] Model-Based Hierarchical Clustering for Categorical Data
    Alalyan, Fahdah
    Zamzami, Nuha
    Bouguila, Nizar
    [J]. 2019 IEEE 28TH INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2019 : 1424 - 1429
  • [27] Penalized regression with model-based penalties
    Heckman, NE
    Ramsay, JO
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2000, 28 (02) : 241 - 258
  • [28] Model-based clustering and data transformations for gene expression data
    Yeung, KY
    Fraley, C
    Murua, A
    Raftery, AE
    Ruzzo, WL
    [J]. BIOINFORMATICS, 2001, 17 (10) : 977 - 987
  • [29] Model-based clustering for RNA-seq data
    Si, Yaqing
    Liu, Peng
    Li, Pinghua
    Brutnell, Thomas P.
    [J]. BIOINFORMATICS, 2014, 30 (02) : 197 - 205
  • [30] Model-Based Clustering for Conditionally Correlated Categorical Data
    Marbac, Matthieu
    Biernacki, Christophe
    Vandewalle, Vincent
    [J]. JOURNAL OF CLASSIFICATION, 2015, 32 (02) : 145 - 175