Cluster analysis of microbiome data by using mixtures of Dirichlet-multinomial regression models

Cited by: 9
Authors
Subedi, Sanjeena [1]
Neish, Drew [2]
Bak, Stephen [2]
Feng, Zeny [2]
Affiliations
[1] SUNY Binghamton, Binghamton, NY USA
[2] Univ Guelph, Guelph, ON, Canada
Keywords
Cluster analysis; Dirichlet-multinomial regression; Expectation-maximization algorithm; Microbiome data; Mixture models; Operational taxonomic unit; GUT MICROBIOTA; MAXIMUM-LIKELIHOOD; CANCER; HOST
DOI
10.1111/rssc.12432
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
The human gut microbiome is one of the fundamental components of our physiology, and exploring the relationship between biological and environmental covariates and the resulting taxonomic composition of a given microbial community is an active area of research. A Dirichlet-multinomial regression framework has previously been suggested to model this relationship, but it does not account for any underlying latent group structure. Such a group structure (for example, enterotypes) has been observed across gut microbiome samples, with guts in the same group sharing similar biota compositions. In this paper, a finite mixture of Dirichlet-multinomial regression models is proposed that accounts for this underlying group structure and allows for a probabilistic investigation of the relationship between bacterial abundance and biological and/or environmental covariates within each inferred group. Furthermore, finite mixtures of regression models which incorporate the concomitant effect of the covariates on the resulting mixing proportions are also proposed and examined within the Dirichlet-multinomial framework. We utilize the proposed mixture model to gain insight into underlying subgroups in a microbiome data set comprising tumour and healthy samples and into the relationships between covariates and microbial abundance in those subgroups.
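The kind of model the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a log link between covariates and the Dirichlet-multinomial parameters (alpha_j = exp(x'beta_j)), which the abstract does not state, and all function and parameter names are hypothetical. It evaluates the mixture log-likelihood of one sample and the posterior group probabilities that an expectation-maximization E-step would use. The concomitant variant, in which the mixing proportions also depend on the covariates, is not shown.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log pmf of the Dirichlet-multinomial for one sample of taxon counts."""
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient: log n! - sum_j log y_j!
    out = math.lgamma(n + 1) - sum(math.lgamma(y + 1) for y in counts)
    # Ratio of gamma functions arising from the Dirichlet prior
    out += math.lgamma(a) - math.lgamma(n + a)
    out += sum(math.lgamma(y + aj) - math.lgamma(aj)
               for y, aj in zip(counts, alpha))
    return out


def alpha_from_covariates(x, beta):
    """Assumed log link: one coefficient vector per taxon, alpha_j = exp(x . beta_j)."""
    return [math.exp(sum(b * xi for b, xi in zip(beta_j, x))) for beta_j in beta]


def responsibilities(counts, x, mix_props, betas):
    """Posterior group probabilities and mixture log-likelihood for one sample.

    mix_props: mixing proportions, one per group (fixed, not covariate-dependent).
    betas: one coefficient matrix per group (hypothetical layout: taxa x covariates).
    """
    log_terms = [
        math.log(p) + dm_log_pmf(counts, alpha_from_covariates(x, b))
        for p, b in zip(mix_props, betas)
    ]
    # Log-sum-exp for numerical stability
    m = max(log_terms)
    weights = [math.exp(t - m) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights], m + math.log(total)
```

In a full EM fit these responsibilities would weight each sample's contribution when the group-specific coefficients are updated; that M-step needs a numerical optimizer and is left out here.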
Pages: 1163 - 1187
Number of pages: 25
Related papers
50 in total
  • [32] DIRICHLET-TREE MULTINOMIAL MIXTURES FOR CLUSTERING MICROBIOME COMPOSITIONS
    Mao, Jialiang
    Ma, Li
    [J]. ANNALS OF APPLIED STATISTICS, 2022, 16 (03) : 1476 - 1499
  • [33] Failure Mode Effects & Criticality Analysis (FMECA) using Bayesian Dirichlet-multinomial conjugate pair
    Baun, W.
    [J]. SAFETY AND RELIABILITY - SAFE SOCIETIES IN A CHANGING WORLD, 2018 : 731 - 739
  • [34] Practical perfect sampling using composite bounding chains: the Dirichlet-multinomial model
    Stein, Nathan M.
    Meng, Xiao-Li
    [J]. BIOMETRIKA, 2013, 100 (04) : 817 - 830
  • [35] Model-based estimates of effective sample size in stock assessment models using the Dirichlet-multinomial distribution
    Thorson, James T.
    Johnson, Kelli F.
    Methot, Richard D.
    Taylor, Ian G.
    [J]. FISHERIES RESEARCH, 2017, 192 : 84 - 93
  • [36] Dirichlet Multinomial Mixtures: Generative Models for Microbial Metagenomics
    Holmes, Ian
    Harris, Keith
    Quince, Christopher
    [J]. PLOS ONE, 2012, 7 (02)
  • [37] A Logistic Normal Multinomial Regression Model for Microbiome Compositional Data Analysis
    Xia, Fan
    Chen, Jun
    Fung, Wing Kam
    Li, Hongzhe
    [J]. BIOMETRICS, 2013, 69 (04) : 1053 - 1063
  • [38] Imprecise Probabilistic Prediction for Categorical Data: From Bayesian Inference to the Imprecise Dirichlet-Multinomial Model
    Bernard, Jean-Marc
    [J]. SOFT METHODS FOR HANDLING VARIABILITY AND IMPRECISION, 2008, 48 : 3 - 9
  • [39] Dirichlet negative multinomial regression for overdispersed correlated count data
    Farewell, Daniel M.
    Farewell, Vernon T.
    [J]. BIOSTATISTICS, 2013, 14 (02) : 395 - 404
  • [40] ESTIMATION OF EFFECTIVE SAMPLE SIZE FOR CATCH-AT-AGE AND CATCH-AT-LENGTH DATA USING SIMULATED DATA FROM THE DIRICHLET-MULTINOMIAL DISTRIBUTION
    Candy, S. G.
    [J]. CCAMLR SCIENCE, 2008, 15 : 115 - 138