LongDat: an R package for covariate-sensitive longitudinal analysis of high-dimensional data

Cited by: 2
Authors
Chen, Chia-Yu [1, 2, 3, 4, 5, 6, 7]
Loeber, Ulrike [1, 2, 3, 4, 5, 6, 7]
Forslund, Sofia K. [1, 2, 3, 4, 5, 6, 7, 8]
Affiliations
[1] Max Delbruck Ctr Mol Med Helmholtz Assoc MDC, D-13125 Berlin, Germany
[2] Max Delbruck Ctr Mol Med Helmholtz Assoc, Expt & Clin Res Ctr, D-13125 Berlin, Germany
[3] Charite Univ Med Berlin, D-13125 Berlin, Germany
[4] Charite Univ Med Berlin, D-10117 Berlin, Germany
[5] Free Univ Berlin, D-10117 Berlin, Germany
[6] Humboldt Univ, D-10117 Berlin, Germany
[7] DZHK German Ctr Cardiovasc Res, Partner Site Berlin, D-10785 Berlin, Germany
[8] European Mol Biol Lab, Struct & Computat Biol Unit, D-69117 Heidelberg, Germany
Source
BIOINFORMATICS ADVANCES | 2023 / Volume 3 / Issue 01
Keywords
REGRESSION; MODEL
DOI
10.1093/bioadv/vbad063
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
We introduce LongDat, an R package that analyzes longitudinal multivariable (cohort) data while simultaneously accounting for a potentially large number of covariates. The primary use case is to differentiate direct from indirect effects of an intervention (or treatment) and to identify covariates (potential mechanistic intermediates) in longitudinal data. LongDat focuses on analyzing longitudinal microbiome data, but its usage can be expanded to other data types, such as binary, categorical and continuous data. We tested and compared LongDat with other tools (i.e. MaAsLin2, ANCOM, lgpr and ZIBR) on both simulated and real data. We showed that LongDat outperformed these tools in accuracy, runtime and memory cost, especially when there were multiple covariates. The results indicate that the LongDat R package is a computationally efficient and low-memory-cost tool for longitudinal data with multiple covariates and facilitates robust biomarker searches in high-dimensional datasets.
Pages: 11
Related papers
50 in total
  • [1] PGEE: An R Package for Analysis of Longitudinal Data with High-Dimensional Covariates
    Inan, Gul
    Wang, Lan
    [J]. R JOURNAL, 2017, 9 (01): 393-402
  • [2] Interep: An R Package for High-Dimensional Interaction Analysis of the Repeated Measurement Data
    Zhou, Fei
    Ren, Jie
    Liu, Yuwen
    Li, Xiaoxi
    Wang, Weiqun
    Wu, Cen
    [J]. GENES, 2022, 13 (03)
  • [3] Springer: An R package for bi-level variable selection of high-dimensional longitudinal data
    Zhou, Fei
    Liu, Yuwen
    Ren, Jie
    Wang, Weiqun
    Wu, Cen
    [J]. FRONTIERS IN GENETICS, 2023, 14
  • [4] ordinalgmifs: An R Package for Ordinal Regression in High-dimensional Data Settings
    Archer, Kellie J.
    Hou, Jiayi
    Zhou, Qing
    Ferber, Kyle
    Layne, John G.
    Gentry, Amanda E.
    [J]. CANCER INFORMATICS, 2014, 13: 187-195
  • [5] HDclassif: An R Package for Model-Based Clustering and Discriminant Analysis of High-Dimensional Data
    Berge, Laurent
    Bouveyron, Charles
    Girard, Stephane
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2012, 46 (06): 1-29
  • [6] Ultra high-dimensional semiparametric longitudinal data analysis
    Green, Brittany
    Lian, Heng
    Yu, Yan
    Zu, Tianhai
    [J]. BIOMETRICS, 2021, 77 (03): 903-913
  • [7] SIMEXBoost: An R package for Analysis of High-Dimensional Error-Prone Data Based on Boosting Method
    Chen, Li-Pang
    Qiu, Bangxu
    [J]. R JOURNAL, 2023, 15 (04): 5-20
  • [8] Lagged principal trend analysis for longitudinal high-dimensional data
    Zhang, Yuping
    [J]. STAT, 2019, 8 (01)
  • [9] Joint principal trend analysis for longitudinal high-dimensional data
    Zhang, Yuping
    Ouyang, Zhengqing
    [J]. BIOMETRICS, 2018, 74 (02): 430-438
  • [10] Logistic regression error-in-covariate models for longitudinal high-dimensional covariates
    Park, Hyung
    Lee, Seonjoo
    [J]. STAT, 2019, 8 (01)