clusterMLD: An Efficient Hierarchical Clustering Method for Multivariate Longitudinal Data

Cited by: 4
Authors
Zhou, Junyi [1 ]
Zhang, Ying [2 ]
Tu, Wanzhu [1 ]
Affiliations
[1] Indiana Univ, Dept Biostat & Hlth Data Sci, Indianapolis, IN USA
[2] Univ Nebraska Med Ctr, Dept Biostat, Omaha, NE 68198 USA
Funding
National Institutes of Health (USA)
Keywords
B-splines; Dissimilarity metric; Functional data; Longitudinal data; Multiple outcomes; HUNTINGTONS-DISEASE; MODEL; GENE; TIME; PREDICTION; AGREEMENT; DIAGNOSIS; PROFILES; NUMBER; KML;
DOI
10.1080/10618600.2022.2149540
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Longitudinal data clustering is challenging because the grouping has to account for the similarity of individual trajectories in the presence of sparse and irregular times of observation. This paper puts forward a hierarchical agglomerative clustering method based on a dissimilarity metric that quantifies the cost of merging two distinct groups of curves, which are depicted by B-splines for the repeatedly measured data. Extensive simulations show that the proposed method has superior performance in determining the number of clusters, classifying individuals into the correct clusters, and in computational efficiency. Importantly, the method is not only suitable for clustering multivariate longitudinal data with sparse and irregular measurements but also for intensely measured functional data. Towards this end, we provide an R package for the implementation of such analyses. To illustrate the use of the proposed clustering method, two large clinical data sets from real-world clinical studies are analyzed.
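The merging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' R package or their exact dissimilarity metric: it assumes a single outcome, a fixed clamped cubic B-spline basis on [0, 1], and a merge cost defined as the increase in residual sum of squares when two groups share one fitted curve. All function names (`bspline_basis`, `pooled_sse`, `merge_cost`, `agglomerate`) and the simulated data are hypothetical.

```python
import numpy as np

def bspline_basis(x, knots, degree=3):
    """Design matrix of B-spline basis functions (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator of each half-open knot interval.
    B = np.zeros((x.size, t.size - 1))
    for j in range(t.size - 1):
        B[:, j] = (t[j] <= x) & (x < t[j + 1])
    # Close the last non-empty interval so that x == t[-1] is covered.
    last = np.max(np.nonzero(t[:-1] < t[1:])[0])
    B[x == t[-1], last] = 1.0
    for d in range(1, degree + 1):
        nxt = np.zeros((x.size, t.size - 1 - d))
        for j in range(t.size - 1 - d):
            left = t[j + d] - t[j]
            right = t[j + d + 1] - t[j + 1]
            if left > 0:
                nxt[:, j] += (x - t[j]) / left * B[:, j]
            if right > 0:
                nxt[:, j] += (t[j + d + 1] - x) / right * B[:, j + 1]
        B = nxt
    return B

def pooled_sse(cluster, knots, degree=3):
    """Residual sum of squares of one spline fitted to all curves in a cluster."""
    t = np.concatenate([s[0] for s in cluster])
    y = np.concatenate([s[1] for s in cluster])
    B = bspline_basis(t, knots, degree)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = y - B @ coef
    return float(resid @ resid)

def merge_cost(a, b, knots, degree=3):
    """Assumed dissimilarity: extra SSE incurred by forcing a and b to share a curve."""
    return (pooled_sse(a + b, knots, degree)
            - pooled_sse(a, knots, degree)
            - pooled_sse(b, knots, degree))

def agglomerate(subjects, n_clusters, knots, degree=3):
    """Greedy agglomeration: repeatedly merge the pair with the smallest merge cost."""
    clusters = [[s] for s in subjects]
    labels = [[i] for i in range(len(subjects))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                c = merge_cost(clusters[i], clusters[j], knots, degree)
                if best is None or c < best[0]:
                    best = (c, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        labels[i] = labels[i] + labels[j]
        del clusters[j], labels[j]
    return labels

# Simulated example: two groups of subjects with irregular observation times.
rng = np.random.default_rng(0)
knots = np.array([0, 0, 0, 0, 0.5, 1, 1, 1, 1], dtype=float)  # clamped cubic, one interior knot

def simulate(mean_fn, n_obs):
    # One observation per stratum of [0, 1), jittered so times differ across subjects.
    t = (np.arange(n_obs) + rng.uniform(size=n_obs)) / n_obs
    return t, mean_fn(t) + rng.normal(scale=0.05, size=n_obs)

subjects = [simulate(lambda t: np.sin(2 * np.pi * t), int(rng.integers(6, 10)))
            for _ in range(6)]
subjects += [simulate(lambda t: 2 + np.cos(2 * np.pi * t), int(rng.integers(6, 10)))
             for _ in range(6)]
labels = agglomerate(subjects, n_clusters=2, knots=knots)
print(labels)
```

The sketch fixes the number of clusters in advance and recomputes every pairwise cost at each step, which is only practical for small examples; how the published method selects the number of clusters and achieves its computational efficiency is not covered here.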
Pages: 1131-1144
Number of pages: 14