Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis

Cited by: 47
Authors
Tang, Zheng-Zheng [1, 2]
Chen, Guanhua [1 ]
Affiliations
[1] Univ Wisconsin, Dept Biostat & Med Informat, Madison, WI 53715 USA
[2] Wisconsin Inst Discovery, Madison, WI 53715 USA
Keywords
Compositional data analysis; Differential abundance; Hierarchical model; Microbiome; Score test; Zero-inflated model; False discovery rate; Wide association; Selection; Framework
DOI
10.1093/biostatistics/kxy025
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
There is heightened interest in using high-throughput sequencing technologies to quantify abundances of microbial taxa and linking the abundance to human diseases and traits. Proper modeling of multivariate taxon counts is essential to the power of detecting this association. Existing models are limited in handling excessive zero observations in taxon counts and in flexibly accommodating complex correlation structures and dispersion patterns among taxa. In this article, we develop a new probability distribution, zero-inflated generalized Dirichlet multinomial (ZIGDM), that overcomes these limitations in modeling multivariate taxon counts. Based on this distribution, we propose a ZIGDM regression model to link microbial abundances to covariates (e.g. disease status) and develop a fast expectation-maximization algorithm to efficiently estimate parameters in the model. The derived tests enable us to reveal rich patterns of variation in microbial compositions including differential mean and dispersion. The advantages of the proposed methods are demonstrated through simulation studies and an analysis of a gut microbiome dataset.
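As a rough illustration of the kind of distribution the abstract describes, the sketch below draws one vector of taxon counts from a zero-inflated generalized Dirichlet multinomial. It assumes the usual stick-breaking form of the generalized Dirichlet, with each stick-breaking fraction set to zero with some probability. The function name `sample_zigdm` and the parameter names `pi`, `a`, and `b` are hypothetical choices made for this example, not the authors' notation or code. The regression model and the expectation-maximization algorithm are not shown.

```python
import numpy as np

def sample_zigdm(n_reads, pi, a, b, rng):
    """Draw one count vector of length K+1 (illustrative sketch, not the authors' code).

    pi, a, b are length-K arrays: zero-inflation probabilities and Beta parameters
    (hypothetical names chosen for this example).
    """
    pi = np.asarray(pi, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Stick-breaking fractions: exactly zero with probability pi_j, else Beta(a_j, b_j).
    z = rng.beta(a, b)
    z = np.where(rng.random(pi.shape) < pi, 0.0, z)
    # Turn fractions into proportions: p_j = z_j * prod_{k<j} (1 - z_k);
    # the last taxon receives whatever remains of the stick.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - z)))
    p = np.append(z * remaining[:-1], remaining[-1])
    # Multinomial sampling of the observed read counts given the proportions.
    return rng.multinomial(n_reads, p)

rng = np.random.default_rng(0)
counts = sample_zigdm(1000, pi=[0.3, 0.1, 0.5], a=[2.0, 1.0, 3.0], b=[5.0, 2.0, 1.0], rng=rng)
```

Here `counts` has four entries that sum to 1000. Setting every entry of `pi` to 1 forces all reads into the last taxon, which is a quick sanity check on the zero-inflation step.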
Pages: 698-713
Page count: 16
Related papers
50 in total
  • [1] A Bayesian zero-inflated Dirichlet-multinomial regression model for multivariate compositional count data
    Koslovsky, Matthew D.
    [J]. BIOMETRICS, 2023, 79 (04) : 3239 - 3251
  • [2] Compositional zero-inflated network estimation for microbiome data
    Ha, Min Jin
    Kim, Junghi
    Galloway-Pena, Jessica
    Kim-Anh Do
    Peterson, Christine B.
    [J]. BMC BIOINFORMATICS, 2020, 21 (Suppl 21)
  • [3] Compositional zero-inflated network estimation for microbiome data
    Min Jin Ha
    Junghi Kim
    Jessica Galloway-Peña
    Kim-Anh Do
    Christine B. Peterson
    [J]. BMC Bioinformatics, 21
  • [4] A Bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data
    Jiang, Shuang
    Xiao, Guanghua
    Koh, Andrew Y.
    Kim, Jiwoong
    Li, Qiwei
    Zhan, Xiaowei
    [J]. BIOSTATISTICS, 2021, 22 (03) : 522 - 540
  • [5] A Zero-Inflated Latent Dirichlet Allocation Model for Microbiome Studies
    Deek, Rebecca A.
    Li, Hongzhe
    [J]. FRONTIERS IN GENETICS, 2021, 11
  • [6] A Zero-Inflated Regression Model for Grouped Data
    Brown, Sarah
    Duncan, Alan
    Harris, Mark N.
    Roberts, Jennifer
    Taylor, Karl
    [J]. OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 2015, 77 (06) : 822 - 831
  • [7] MarZIC: A Marginal Mediation Model for Zero-Inflated Compositional Mediators with Applications to Microbiome Data
    Wu, Quran
    O'Malley, James
    Datta, Susmita
    Gharaibeh, Raad Z.
    Jobin, Christian
    Karagas, Margaret R.
    Coker, Modupe O.
    Hoen, Anne G.
    Christensen, Brock C.
    Madan, Juliette C.
    Li, Zhigang
    [J]. GENES, 2022, 13 (06)
  • [8] The analysis of zero-inflated count data: Beyond zero-inflated Poisson regression.
    Loeys, Tom
    Moerkerke, Beatrijs
    De Smet, Olivia
    Buysse, Ann
    [J]. BRITISH JOURNAL OF MATHEMATICAL & STATISTICAL PSYCHOLOGY, 2012, 65 (01) : 163 - 180
  • [9] Principal component analysis for zero-inflated compositional data
    Kim, Kipoong
    Park, Jaesung
    Jung, Sungkyu
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2024, 198
  • [10] A zero-inflated beta-binomial model for microbiome data analysis
    Hu, Tao
    Gallins, Paul
    Zhou, Yi-Hui
    [J]. STAT, 2018, 7 (01)