BAYESIAN MULTIVARIATE SPARSE FUNCTIONAL PRINCIPAL COMPONENTS ANALYSIS WITH APPLICATION TO LONGITUDINAL MICROBIOME MULTIOMICS DATA

Cited by: 3
Authors
Jiang, Lingjing [1]
Elrod, Chris [2]
Kim, Jane J. [3]
Swafford, Austin D. [4]
Knight, Rob [5]
Thompson, Wesley K. [1]
Affiliations
[1] Univ Calif San Diego, Herbert Wertheim Sch Publ Hlth & Human Longev Sci, La Jolla, CA 92093 USA
[2] Julia Comp, Boston, MA USA
[3] Univ Calif San Diego, Dept Pediat, La Jolla, CA 92093 USA
[4] Univ Calif San Diego, Ctr Microbiome Innovat, La Jolla, CA 92093 USA
[5] Univ Calif San Diego, Dept Pediat, Ctr Microbiome Innovat, Dept Comp Sci & Engn, Dept Bioengn, La Jolla, CA 92093 USA
Source
ANNALS OF APPLIED STATISTICS | 2022 / Volume 16 / Issue 04
Keywords
Bayesian; functional data analysis; longitudinal; microbiome; multiomics; GUT MICROBIOME; INFECTION; RATES; OMICS
DOI
10.1214/21-AOAS1587
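Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract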
Microbiome researchers often need to model the temporal dynamics of multiple complex, nonlinear outcome trajectories simultaneously. This motivates our development of multivariate Sparse Functional Principal Components Analysis (mSFPCA), extending existing SFPCA methods to simultaneously characterize multiple temporal trajectories and their interrelationships. As with existing SFPCA methods, the mSFPCA algorithm characterizes each trajectory as a smooth mean plus a weighted combination of the smooth major modes of variation about the mean, where the weights are given by the component scores for each subject. Unlike existing SFPCA methods, the mSFPCA algorithm allows estimation of multiple trajectories simultaneously, such that the component scores, which are constrained to be independent within a particular outcome for identifiability, may be arbitrarily correlated with component scores for other outcomes. A Cholesky decomposition is used to estimate the component score covariance matrix efficiently and guarantee positive semidefiniteness given these constraints. Mutual information is used to assess the strength of marginal and conditional temporal associations across outcome trajectories. Importantly, we implement mSFPCA as a Bayesian algorithm using R and Stan, enabling easy use of tools such as PSIS-LOO for model selection and graphical posterior predictive checks to assess the validity of mSFPCA models. Although we focus on application of mSFPCA to microbiome data in this paper, the mSFPCA model is of general utility and can be used in a wide range of real-world applications.
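The two numerical ideas named in the abstract, a Cholesky-parameterized score covariance and mutual information between outcomes, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 4-by-4 dimension (two outcomes with two component scores each), the random factor, and the Gaussian assumption behind the closed-form mutual information are all assumptions made here for illustration, and the sketch does not enforce the within-outcome independence constraint described in the abstract.

```python
import numpy as np


def covariance_from_cholesky(lower):
    """Build Sigma = L @ L.T from a lower-triangular factor L.

    Any matrix of this form is symmetric positive semidefinite,
    which is the guarantee the Cholesky parameterization provides.
    """
    L = np.tril(lower)
    return L @ L.T


def gaussian_mutual_information(sigma, n_first):
    """Mutual information (in nats) between the first n_first coordinates
    and the remaining ones, ASSUMING the vector is jointly Gaussian:
    I = 0.5 * (log det S11 + log det S22 - log det Sigma).
    """
    s11 = sigma[:n_first, :n_first]
    s22 = sigma[n_first:, n_first:]
    _, logdet_11 = np.linalg.slogdet(s11)
    _, logdet_22 = np.linalg.slogdet(s22)
    _, logdet_all = np.linalg.slogdet(sigma)
    return 0.5 * (logdet_11 + logdet_22 - logdet_all)


# Hypothetical setup: 2 outcomes x 2 component scores each.
rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 4))
# Strictly positive diagonal so that Sigma is full rank.
L = np.tril(raw, k=-1) + np.diag(np.abs(np.diag(raw)) + 0.5)

sigma = covariance_from_cholesky(L)
min_eig = np.linalg.eigvalsh(sigma).min()

# Association between the scores of outcome 1 and outcome 2.
mi = gaussian_mutual_information(sigma, n_first=2)

# Zeroing the cross-outcome block makes the outcomes independent,
# so the mutual information should vanish up to rounding error.
sigma_indep = sigma.copy()
sigma_indep[:2, 2:] = 0.0
sigma_indep[2:, :2] = 0.0
mi_indep = gaussian_mutual_information(sigma_indep, n_first=2)

print(f"smallest eigenvalue of Sigma: {min_eig:.4f}")
print(f"mutual information (correlated outcomes): {mi:.4f} nats")
print(f"mutual information (independent outcomes): {mi_indep:.2e} nats")
```

Working with the factor L rather than with Sigma directly means that no explicit positive-semidefiniteness check is needed during estimation, since every value of L yields a valid covariance matrix.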
Pages: 2231-2249
Number of pages: 19