Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis

Cited by: 47
Authors
Tang, Zheng-Zheng [1, 2]
Chen, Guanhua [1]
Affiliations
[1] Univ Wisconsin, Dept Biostat & Med Informat, Madison, WI 53715 USA
[2] Wisconsin Inst Discovery, Madison, WI 53715 USA
Keywords
Compositional data analysis; Differential abundance; Hierarchical model; Microbiome; Score test; Zero-inflated model; False discovery rate; Wide association; Selection; Framework
DOI
10.1093/biostatistics/kxy025
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
There is heightened interest in using high-throughput sequencing technologies to quantify abundances of microbial taxa and linking the abundance to human diseases and traits. Proper modeling of multivariate taxon counts is essential to the power of detecting this association. Existing models are limited in handling excessive zero observations in taxon counts and in flexibly accommodating complex correlation structures and dispersion patterns among taxa. In this article, we develop a new probability distribution, zero-inflated generalized Dirichlet multinomial (ZIGDM), that overcomes these limitations in modeling multivariate taxon counts. Based on this distribution, we propose a ZIGDM regression model to link microbial abundances to covariates (e.g. disease status) and develop a fast expectation-maximization algorithm to efficiently estimate parameters in the model. The derived tests enable us to reveal rich patterns of variation in microbial compositions including differential mean and dispersion. The advantages of the proposed methods are demonstrated through simulation studies and an analysis of a gut microbiome dataset.
Pages: 698 - 713
Number of pages: 16
Related papers
50 items in total
  • [41] Multilevel zero-inflated Generalized Poisson regression modeling for dispersed correlated count data
    Almasi, Afshin
    Eshraghian, Mohammad Reza
    Moghimbeigi, Abbas
    Rahimi, Abbas
    Mohammad, Kazem
    Fallahigilan, Sadegh
    [J]. STATISTICAL METHODOLOGY, 2016, 30 : 1 - 14
  • [42] BAYESIAN MIXED EFFECTS MODELS FOR ZERO-INFLATED COMPOSITIONS IN MICROBIOME DATA ANALYSIS
    Ren, Boyu
    Bacallado, Sergio
    Favaro, Stefano
    Vatanen, Tommi
    Huttenhower, Curtis
    Trippa, Lorenzo
    [J]. ANNALS OF APPLIED STATISTICS, 2020, 14 (01) : 494 - 517
  • [43] A new zero-inflated discrete Lindley regression model
    Tanis, Caner
    Koc, Haydar
    Pekgor, Ahmet
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2024, 53 (12) : 4252 - 4271
  • [44] Statistical Inference in a Zero-Inflated Bell Regression Model
    Essoham Ali
    Mamadou Lamine Diop
    Aliou Diop
    [J]. Mathematical Methods of Statistics, 2022, 31 : 91 - 104
  • [45] Model selection for zero-inflated regression with missing covariates
    Chen, Xue-Dong
    Fu, Ying-Zi
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, 55 (01) : 765 - 773
  • [46] Statistical Inference in a Zero-Inflated Bell Regression Model
    Ali, Essoham
    Diop, Mamadou Lamine
    Diop, Aliou
    [J]. MATHEMATICAL METHODS OF STATISTICS, 2022, 31 (03) : 91 - 104
  • [47] Assessment and Selection of Competing Models for Zero-Inflated Microbiome Data
    Xu, Lizhen
    Paterson, Andrew D.
    Turpin, Williams
    Xu, Wei
    [J]. PLOS ONE, 2015, 10 (07):
  • [48] Hurdle Model for Longitudinal Zero-Inflated Count Data Analysis
    Jin, Iktae
    Lee, Keunbaik
    [J]. KOREAN JOURNAL OF APPLIED STATISTICS, 2014, 27 (06) : 923 - 932
  • [49] A Zero-Inflated Logistic Normal Multinomial Model for Extracting Microbial Compositions
    Zeng, Yanyan
    Pang, Daolin
    Zhao, Hongyu
    Wang, Tao
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (544) : 2356 - 2369
  • [50] A constrained marginal zero-inflated binomial regression model
    Ali, Essoham
    Diop, Aliou
    Dupuy, Jean-Francois
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2022, 51 (18) : 6396 - 6422