Fast zero-inflated negative binomial mixed modeling approach for analyzing longitudinal metagenomics data

Cited by: 22
Authors
Zhang, Xinyan [1]
Yi, Nengjun [2]
Affiliations
[1] Georgia Southern Univ, Jiann Ping Hsu Coll Publ Hlth, Dept Biostat, Statesboro, GA 30458 USA
[2] Univ Alabama Birmingham, Dept Biostat, Birmingham, AL 35294 USA
Keywords
BACTERIAL VAGINOSIS;
DOI
10.1093/bioinformatics/btz973
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Longitudinal metagenomics data, including both 16S rRNA and whole-metagenome shotgun sequencing data, have enhanced our ability to understand the dynamic associations between the human microbiome and various diseases. However, analytic tools have not been fully developed to simultaneously address the main challenges of longitudinal metagenomics data, i.e. high dimensionality, dependence among samples and zero-inflation of observed counts.
Results: We propose a fast zero-inflated negative binomial mixed modeling (FZINBMM) approach to analyze high-dimensional longitudinal metagenomic count data. The FZINBMM approach is based on zero-inflated negative binomial mixed models (ZINBMMs) for modeling longitudinal metagenomic count data and a fast EM-IWLS algorithm for fitting ZINBMMs. FZINBMM takes advantage of a commonly used procedure for fitting linear mixed models, which allows us to include various types of fixed and random effects and within-subject correlation structures and to analyze many taxa quickly. We found that FZINBMM was markedly more computationally efficient than, and statistically comparable with, two R packages, GLMMadaptive and glmmTMB, which use numerical integration to fit ZINBMMs. Extensive simulations and real data applications showed that FZINBMM outperformed previous methods, including linear mixed models, negative binomial mixed models and zero-inflated Gaussian mixed models.
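For readers unfamiliar with the zero-inflated negative binomial (ZINB) distribution named in the abstract, the following is a minimal sketch of its probability mass function and of the generic EM E-step weight for a zero count. It is an illustration under stated assumptions, not the authors' FZINBMM implementation: the function names and the mean/dispersion parameterization (`mu`, `theta`, with variance `mu + mu**2 / theta`) are chosen here for illustration, and random effects and the IWLS update are omitted entirely.

```python
import math


def nb_logpmf(y, mu, theta):
    """Log-pmf of a negative binomial with mean mu and dispersion theta.

    Assumed parameterization: Var(Y) = mu + mu**2 / theta.
    """
    return (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )


def zinb_pmf(y, mu, theta, pi):
    """P(Y = y) under a ZINB model.

    With probability pi the count is a structural zero; otherwise it is
    drawn from the negative binomial above.
    """
    nb = math.exp(nb_logpmf(y, mu, theta))
    point_mass = pi if y == 0 else 0.0
    return point_mass + (1.0 - pi) * nb


def posterior_structural_zero(y, mu, theta, pi):
    """Generic EM E-step weight: P(structural zero | Y = y).

    This is the standard E-step for zero-inflated count models; it is not
    taken from the paper's EM-IWLS derivation.
    """
    if y > 0:
        return 0.0  # a positive count cannot be a structural zero
    return pi / zinb_pmf(0, mu, theta, pi)
```

In an EM scheme of this general kind, these weights would then be used to reweight the count-model and zero-inflation-model fits in the M-step; how FZINBMM does this with mixed-model machinery is described in the paper itself.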
Pages: 2345-2351
Number of pages: 7