Model Based Screening Embedded Bayesian Variable Selection for Ultra-high Dimensional Settings

Cited by: 2
Authors
Li, Dongjin [1]
Dutta, Somak [1]
Roy, Vivekananda [1]
Affiliations
[1] Iowa State Univ, Dept Stat, Ames, IA USA
Keywords
GWAS; Hierarchical model; Posterior prediction; Shrinkage; Spike and slab; Stochastic search; Subset selection; STANDARD ERRORS; REGRESSION SHRINKAGE; OPTIMIZATION; CRITERIA; PRIORS
DOI
10.1080/10618600.2022.2074428
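Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract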
We develop a Bayesian variable selection method, called SVEN, based on a hierarchical Gaussian linear model with priors placed on the regression coefficients as well as on the model space. Sparsity is achieved by using degenerate spike priors on inactive variables, whereas Gaussian slab priors are placed on the coefficients for the important predictors making the posterior probability of a model available in explicit form (up to a normalizing constant). Embedding a unique model based screening and using fast Cholesky updates, SVEN produces a highly scalable computational framework to explore gigantic model spaces, rapidly identify the regions of high posterior probabilities and make fast inference and prediction. A temperature schedule is used to further mitigate multimodal posterior distributions. The temperature value is guided by our model selection consistency results which hold even when the norm of mean effects solely due to the unimportant variables diverges. An appealing byproduct of SVEN is the construction of novel model weight adjusted prediction intervals. The performance of SVEN is demonstrated through a number of simulation experiments and a real data example from a genome wide association study with over half a million markers. Supplementary materials for this article are available online.
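The abstract notes that the degenerate spike and Gaussian slab priors make the posterior probability of a model available in explicit form up to a normalizing constant. The sketch below illustrates one standard conjugate version of that calculation; it is not taken from the paper. It assumes a slab prior N(0, sigma^2 / lam) on active coefficients, independent Bernoulli(w) inclusion indicators, a flat prior on the intercept, and a prior proportional to 1/sigma^2 on the error variance. The function name and the parameters `lam` and `w` are hypothetical, and the paper's exact specification and its fast Cholesky updates may differ.

```python
import numpy as np


def log_posterior_unnorm(X, y, active, lam=1.0, w=0.1):
    """Log posterior of one model, up to an additive constant.

    Assumed (hypothetical) setup: slab N(0, sigma^2/lam) on active
    coefficients, exact zeros elsewhere, Bernoulli(w) inclusion prior,
    flat intercept prior, and p(sigma^2) proportional to 1/sigma^2.
    """
    n, p = X.shape
    k = len(active)
    yc = y - y.mean()  # centering integrates out the intercept
    log_prior = k * np.log(w) + (p - k) * np.log1p(-w)
    if k == 0:
        rss = yc @ yc
        logdet = 0.0
    else:
        Xg = X[:, active] - X[:, active].mean(axis=0)
        A = Xg.T @ Xg + lam * np.eye(k)      # ridge-type Gram matrix
        L = np.linalg.cholesky(A)            # A = L L^T
        v = np.linalg.solve(L, Xg.T @ yc)    # v = L^{-1} X_g^T y
        rss = yc @ yc - v @ v                # ridge residual sum of squares
        logdet = 2.0 * np.log(np.diag(L)).sum()
    return (0.5 * k * np.log(lam) - 0.5 * logdet
            - 0.5 * (n - 1) * np.log(rss) + log_prior)


# Toy comparison of two candidate models on simulated data.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((50, 5))
y_demo = 3.0 * X_demo[:, 0] + 0.1 * rng.standard_normal(50)
score_true = log_posterior_unnorm(X_demo, y_demo, [0])
score_null = log_posterior_unnorm(X_demo, y_demo, [])
```

Because only differences of these scores matter, a stochastic search can compare neighboring models without ever computing the normalizing constant over the full model space.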
Pages: 61-73
Number of pages: 13