Cluster analysis of microbiome data by using mixtures of Dirichlet-multinomial regression models

Cited by: 9
Authors
Subedi, Sanjeena [1]
Neish, Drew [2]
Bak, Stephen [2]
Feng, Zeny [2]
Affiliations
[1] SUNY Binghamton, Binghamton, NY USA
[2] Univ Guelph, Guelph, ON, Canada
Keywords
Cluster analysis; Dirichlet-multinomial regression; Expectation-maximization algorithm; Microbiome data; Mixture models; Operational taxonomic unit; Gut microbiota; Maximum likelihood; Cancer; Host
DOI
10.1111/rssc.12432
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
The human gut microbiome is one of the fundamental components of our physiology, and exploring the relationship between biological and environmental covariates and the resulting taxonomic composition of a given microbial community is an active area of research. Previously, a Dirichlet-multinomial regression framework was suggested to model this relationship, but it did not account for any underlying latent group structure. An underlying group structure of guts (such as enterotypes) has been observed across gut microbiome samples, in which guts in the same group share similar biota compositions. In this paper, a finite mixture of Dirichlet-multinomial regression models is proposed that accounts for this underlying group structure and allows for a probabilistic investigation of the relationship between bacterial abundance and biological and/or environmental covariates within each inferred group. Furthermore, finite mixtures of regression models that incorporate the concomitant effect of the covariates on the resulting mixing proportions are also proposed and examined within the Dirichlet-multinomial framework. We utilize the proposed mixture model to gain insight into underlying subgroups in a microbiome data set comprising tumour and healthy samples, and into the relationships between covariates and microbial abundance in those subgroups.
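To make the model class in the abstract concrete, the following is a minimal Python sketch of the Dirichlet-multinomial log-probability and the posterior group-membership probabilities that an expectation-maximization E-step would compute for one sample. It is an illustration under stated assumptions, not the authors' implementation: the log-linear link `alpha_j = exp(x . beta_j)`, the fixed (non-concomitant) mixing proportions, and all function and parameter names are hypothetical choices made here.

```python
import math


def dm_logpmf(counts, alpha):
    """Log-probability of one vector of taxon counts under a
    Dirichlet-multinomial distribution with parameters alpha."""
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient n! / prod(y_j!)
    out = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    # Gamma(a0) / Gamma(n + a0)
    out += math.lgamma(a0) - math.lgamma(n + a0)
    # prod Gamma(y_j + alpha_j) / Gamma(alpha_j)
    out += sum(math.lgamma(c + a) - math.lgamma(a)
               for c, a in zip(counts, alpha))
    return out


def alphas_from_covariates(x, beta):
    """Assumed log-linear link: alpha_j = exp(x . beta_j).
    beta holds one coefficient vector per taxon."""
    return [math.exp(sum(xi * b for xi, b in zip(x, bj))) for bj in beta]


def responsibilities(counts, x, mixing, betas):
    """Posterior probability of each group for one sample, given
    mixing proportions and one coefficient set per group."""
    logs = [math.log(p) + dm_logpmf(counts, alphas_from_covariates(x, b))
            for p, b in zip(mixing, betas)]
    m = max(logs)  # subtract the maximum for numerical stability
    w = [math.exp(v - m) for v in logs]
    s = sum(w)
    return [wi / s for wi in w]
```

In the concomitant variant described in the abstract, the mixing proportions would themselves depend on the covariates; this sketch keeps them fixed for brevity.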
Pages: 1163-1187
Number of pages: 25
    [J]. MULTIVARIATE BEHAVIORAL RESEARCH, 2017, 52 (02) : 259 - 270