Model-based clustering and analysis of life history data

Cited by: 1
Authors
Scott, Marc A. [1 ]
Mohan, Kaushik [1 ]
Gauthier, Jacques-Antoine [2 ]
Affiliations
[1] NYU, 3rd Floor, 246 Greene St, New York, NY 10003 USA
[2] Univ Lausanne, Lausanne, Switzerland
Funding
Swiss National Science Foundation
Keywords
Categorical data; Life course studies; Longitudinal data; Model-based clustering; Sequence analysis; Swiss Household Panel; LATENT CLASS; SEQUENCE-ANALYSIS; MIXTURE-MODELS; TRAJECTORIES; INFERENCE; WORK; TIME;
DOI
10.1111/rssa.12575
Chinese Library Classification
O1 [Mathematics]; C [Social Sciences, General]
Subject classification codes
03; 0303; 0701; 070101
Abstract
Methods and models for longitudinal data with categorical, multi-dimensional outcomes are quite limited, but they are essential to the study of life histories. For example, in the Swiss Household Panel, information on the co-residence and professional status of several thousand individuals is available through to age 45 years. Interest centres on the time and order of life course events such as having children and working full or part time and the duration of the phases that they delineate. With data of this type, optimal matching and clustering algorithms relying on a distance metric or parametric models of duration in a competing risks framework are used; the appropriateness of each derives from competing goals and orientation. We prefer model-based approaches when certain goals are paramount: simulation of individual trajectories; adjusting for time-dependent covariates; handling multistate trajectories and missing outcomes. Several of these goals are particularly challenging when the number of states is of moderate size, and many transitions are infrequent and/or time inhomogeneous. Using the Swiss Household Panel, we demonstrate the appropriateness of latent class growth curve models for analysing sequence data. In particular, models including heterogeneous dependence structure provide new techniques for assessing goodness of fit as well as yield insights into social processes.
Pages: 1231 - 1251
Number of pages: 21
Related papers
50 in total
  • [1] Model-based clustering of longitudinal data
    McNicholas, Paul D.
    Murphy, T. Brendan
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2010, 38 (01) : 153 - 168
  • [2] Boosting for model-based data clustering
    Saffari, Amir
    Bischof, Horst
    [J]. PATTERN RECOGNITION, 2008, 5096 : 51 - 60
  • [3] Model-based clustering for longitudinal data
    De la Cruz-Mesia, Rolando
    Quintana, Fernando A.
    Marshall, Guillermo
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2008, 52 (03) : 1441 - 1457
  • [4] Model-Based Clustering of Temporal Data
    El Assaad, Hani
    Same, Allou
    Govaert, Gerard
    Aknin, Patrice
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2013, 2013, 8131 : 9 - 16
  • [5] Model-based clustering with missing not at random data
    Sportisse, Aude
    Marbac, Matthieu
    Laporte, Fabien
    Celeux, Gilles
    Boyer, Claire
    Josse, Julie
    Biernacki, Christophe
    [J]. STATISTICS AND COMPUTING, 2024, 34 (04)
  • [6] Model-based clustering and classification of functional data
    Chamroukhi, Faicel
    Nguyen, Hien D.
    [J]. WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2019, 9 (04)
  • [7] On model-based clustering of skewed matrix data
    Melnykov, Volodymyr
    Zhu, Xuwen
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2018, 167 : 181 - 194
  • [8] Model-based Clustering and Classification for Data Science
    Unwin, Antony
    [J]. INTERNATIONAL STATISTICAL REVIEW, 2020, 88 (01) : 263 - 264
  • [9] Model-based clustering of array CGH data
    Shah, Sohrab P.
    Cheung, K-John, Jr.
    Johnson, Nathalie A.
    Alain, Guillaume
    Gascoyne, Randy D.
    Horsman, Douglas E.
    Ng, Raymond T.
    Murphy, Kevin P.
    [J]. BIOINFORMATICS, 2009, 25 (12) : I30 - I38
  • [10] Model-based multidimensional clustering of categorical data
    Chen, Tao
    Zhang, Nevin L.
    Liu, Tengfei
    Poon, Kin Man
    Wang, Yi
    [J]. ARTIFICIAL INTELLIGENCE, 2012, 176 (01) : 2246 - 2269