Bayesian Variable Shrinkage and Selection in Compositional Data Regression: Application to Oral Microbiome

Cited by: 0
Authors
Datta, Jyotishka [1]
Bandyopadhyay, Dipankar [2]
Affiliations
[1] Virginia Polytech Inst & State Univ, Dept Stat, 250 Drillfield Dr, Blacksburg, VA 24061 USA
[2] Virginia Commonwealth Univ, Sch Populat Hlth, Dept Biostat, One Capital Sq,7th Floor,830 East Main St,POB 9800, Richmond, VA 23298 USA
Funding
National Institutes of Health (USA)
Keywords
Bayesian; Compositional data; Generalized Dirichlet; Dirichlet; Large p; Shrinkage prior; Sparse probability vectors; Stick-breaking; Horseshoe; ASYMPTOTIC PROPERTIES; PRIORS; ESTIMATOR; INFERENCE; RISK;
DOI
10.1007/s41096-024-00194-9
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Microbiome studies generate multivariate compositional responses, such as taxa counts, which are strictly non-negative, bounded, reside within a simplex, and are subject to a unit-sum constraint. In the presence of covariates (which can be moderate to high dimensional), they are popularly modeled via the Dirichlet-Multinomial (D-M) regression framework. In this paper, we consider a Bayesian approach for estimation and inference under a D-M compositional framework, and present a comparative evaluation of some state-of-the-art continuous shrinkage priors for efficient variable selection to identify the most significant associations between available covariates and taxonomic abundance. Specifically, we compare the performance of the horseshoe and horseshoe+ priors (with the Bayesian lasso as the benchmark), utilizing Hamiltonian Monte Carlo techniques for posterior sampling and for generating posterior credible intervals. Our simulation studies using synthetic data demonstrate excellent recovery and estimation accuracy in the sparse parameter regime by the continuous shrinkage priors. We further illustrate our method via application to a motivating oral microbiome data set generated from the NYC-HANES study. An RStan implementation of our method is available on GitHub (https://github.com/dattahub/compshrink).
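To make the data-generating setup in the abstract concrete, the following is a minimal sketch of simulating Dirichlet-Multinomial counts from a sparse coefficient matrix. It is not the authors' RStan code: the function name `simulate_dm_regression`, the log link, the dimensions, and the sequencing depth are all assumptions chosen for illustration only.

```python
import numpy as np


def simulate_dm_regression(n=50, p=10, k=4, n_nonzero=3, depth=1000, seed=0):
    """Simulate counts from a Dirichlet-Multinomial regression (illustrative only).

    All defaults are hypothetical; a log link alpha = exp(X @ beta) is assumed.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))            # covariates
    beta = np.zeros((p, k))                    # sparse truth: mostly zeros
    idx = rng.choice(p * k, size=n_nonzero, replace=False)
    beta.flat[idx] = rng.choice([-1.0, 1.0], size=n_nonzero)
    alpha = np.exp(X @ beta)                   # positive Dirichlet parameters
    # Per-sample composition on the simplex, then multinomial counts.
    probs = np.vstack([rng.dirichlet(a) for a in alpha])
    Y = np.vstack([rng.multinomial(depth, pr) for pr in probs])
    return X, beta, Y


X, beta, Y = simulate_dm_regression()
```

Each row of `Y` sums to the assumed depth, so dividing by the row sums gives a composition on the simplex; a shrinkage prior such as the horseshoe would then be placed on the entries of `beta` when fitting.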
Pages: 25
Related papers
50 in total
  • [1] Bayesian Graphical Compositional Regression for Microbiome Data
    Mao, Jialiang
    Chen, Yuhan
    Ma, Li
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115 (530) : 610 - 624
  • [2] Variable selection in microbiome compositional data analysis
    Susin, Antoni
    Wang, Yiwen
    Lê Cao, Kim-Anh
    Calle, M. Luz
    [J]. NAR GENOMICS AND BIOINFORMATICS, 2020, 2 (02)
  • [3] Bayesian compositional regression with structured priors for microbiome feature selection
    Zhang, Liangliang
    Shi, Yushu
    Jenq, Robert R.
    Do, Kim-Anh
    Peterson, Christine B.
    [J]. BIOMETRICS, 2021, 77 (03) : 824 - 838
  • [4] Compositional variable selection in quantile regression for microbiome data with false discovery rate control
    Li, Runze
    Mu, Jin
    Yang, Songshan
    Ye, Cong
    Zhan, Xiang
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2024, 17 (02)
  • [5] A Bayesian joint model for compositional mediation effect selection in microbiome data
    Fu, Jingyan
    Koslovsky, Matthew D.
    Neophytou, Andreas M.
    Vannucci, Marina
    [J]. STATISTICS IN MEDICINE, 2023, 42 (17) : 2999 - 3015
  • [6] Application of Bayesian variable selection in logistic regression model
    Bangchang, Kannat Na
    [J]. AIMS MATHEMATICS, 2024, 9 (05) : 13336 - 13345
  • [7] VARIABLE SELECTION FOR SPARSE DIRICHLET-MULTINOMIAL REGRESSION WITH AN APPLICATION TO MICROBIOME DATA ANALYSIS
    Chen, Jun
    Li, Hongzhe
    [J]. ANNALS OF APPLIED STATISTICS, 2013, 7 (01) : 418 - 442
  • [8] REGRESSION ANALYSIS FOR MICROBIOME COMPOSITIONAL DATA
    Shi, Pixu
    Zhang, Anru
    Li, Hongzhe
    [J]. ANNALS OF APPLIED STATISTICS, 2016, 10 (02) : 1019 - 1040
  • [9] Bayesian quantile regression and variable selection for count data with an application to Youth Fitness Survey
    Lv, Jing
    Fu, Yingzi
    [J]. PROCEEDINGS OF 2016 12TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS), 2016 : 14 - 18
  • [10] Variable selection in regression with compositional covariates
    Lin, Wei
    Shi, Pixu
    Feng, Rui
    Li, Hongzhe
    [J]. BIOMETRIKA, 2014, 101 (04) : 785 - 797