Model Based Screening Embedded Bayesian Variable Selection for Ultra-high Dimensional Settings

Cited by: 2
Authors
Li, Dongjin [1]
Dutta, Somak [1]
Roy, Vivekananda [1]
Affiliations
[1] Iowa State Univ, Dept Stat, Ames, IA USA
Keywords
GWAS; Hierarchical model; Posterior prediction; Shrinkage; Spike and slab; Stochastic search; Subset selection; STANDARD ERRORS; REGRESSION SHRINKAGE; OPTIMIZATION; CRITERIA; PRIORS
DOI
10.1080/10618600.2022.2074428
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We develop a Bayesian variable selection method, called SVEN, based on a hierarchical Gaussian linear model with priors placed on the regression coefficients as well as on the model space. Sparsity is achieved by using degenerate spike priors on inactive variables, whereas Gaussian slab priors are placed on the coefficients of the important predictors, making the posterior probability of a model available in explicit form (up to a normalizing constant). Embedding a unique model-based screening and using fast Cholesky updates, SVEN produces a highly scalable computational framework to explore gigantic model spaces, rapidly identify the regions of high posterior probability, and make fast inference and prediction. A temperature schedule is used to further mitigate multimodal posterior distributions. The temperature value is guided by our model selection consistency results, which hold even when the norm of mean effects solely due to the unimportant variables diverges. An appealing byproduct of SVEN is the construction of novel model-weight-adjusted prediction intervals. The performance of SVEN is demonstrated through a number of simulation experiments and a real data example from a genome-wide association study with over half a million markers. Supplementary materials for this article are available online.
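To illustrate what "available in explicit form (up to a normalizing constant)" can mean, here is a minimal Python sketch under a generic conjugate setup that is assumed here and not taken from the abstract: a flat prior on the intercept, slab prior beta_gamma ~ N(0, (sigma^2 / lam) I), prior density proportional to 1 / sigma^2 on the error variance, and independent Bernoulli(w) inclusion indicators. The function name `log_model_posterior` and the parameters `lam` and `w` are hypothetical. The sketch uses a full Cholesky factorization per model, not the fast Cholesky updates or the screening step mentioned in the abstract.

```python
import numpy as np

def log_model_posterior(X, y, gamma, lam=1.0, w=0.1):
    """Unnormalized log posterior of the model with active columns `gamma`.

    Assumed (hypothetical) setup: flat intercept prior, Gaussian slab with
    precision lam / sigma^2, prior 1 / sigma^2 on the variance, and
    independent Bernoulli(w) inclusion indicators.
    """
    n, p = X.shape
    gamma = np.asarray(gamma, dtype=int)
    k = gamma.size
    yc = y - y.mean()                       # centering integrates out the intercept
    total_ss = yc @ yc
    log_prior = k * np.log(w) + (p - k) * np.log1p(-w)
    if k == 0:                              # null model: no slab terms
        return -0.5 * (n - 1) * np.log(total_ss) + log_prior
    Xg = X[:, gamma] - X[:, gamma].mean(axis=0)
    A = Xg.T @ Xg + lam * np.eye(k)         # ridge-type Gram matrix
    L = np.linalg.cholesky(A)               # A = L L^T
    v = np.linalg.solve(L, Xg.T @ yc)       # v @ v = yc' Xg A^{-1} Xg' yc
    log_det = 2.0 * np.log(np.diag(L)).sum()
    ridge_rss = total_ss - v @ v            # ridge residual sum of squares
    return (0.5 * k * np.log(lam) - 0.5 * log_det
            - 0.5 * (n - 1) * np.log(ridge_rss) + log_prior)
```

Comparing the returned values for two candidate subsets gives their log posterior odds under these assumptions; a stochastic search would evaluate many such subsets, which is where incremental Cholesky updates would pay off.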
Pages: 61-73
Number of pages: 13