Model Based Screening Embedded Bayesian Variable Selection for Ultra-high Dimensional Settings

Cited by: 2

Authors
Li, Dongjin [1 ]
Dutta, Somak [1 ]
Roy, Vivekananda [1 ]
Affiliation
[1] Iowa State Univ, Dept Stat, Ames, IA USA
Keywords
GWAS; Hierarchical model; Posterior prediction; Shrinkage; Spike and slab; Stochastic search; Subset selection; STANDARD ERRORS; REGRESSION SHRINKAGE; OPTIMIZATION; CRITERIA; PRIORS;
DOI
10.1080/10618600.2022.2074428
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
We develop a Bayesian variable selection method, called SVEN, based on a hierarchical Gaussian linear model with priors placed on the regression coefficients as well as on the model space. Sparsity is achieved by using degenerate spike priors on inactive variables, whereas Gaussian slab priors are placed on the coefficients for the important predictors making the posterior probability of a model available in explicit form (up to a normalizing constant). Embedding a unique model based screening and using fast Cholesky updates, SVEN produces a highly scalable computational framework to explore gigantic model spaces, rapidly identify the regions of high posterior probabilities and make fast inference and prediction. A temperature schedule is used to further mitigate multimodal posterior distributions. The temperature value is guided by our model selection consistency results which hold even when the norm of mean effects solely due to the unimportant variables diverges. An appealing byproduct of SVEN is the construction of novel model weight adjusted prediction intervals. The performance of SVEN is demonstrated through a number of simulation experiments and a real data example from a genome wide association study with over half a million markers. Supplementary materials for this article are available online.
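The explicit-form model posterior mentioned in the abstract can be illustrated with a minimal sketch. This is not the SVEN implementation. It assumes a simplified setup that the abstract does not specify: a known error variance `sigma2`, no intercept, a slab prior beta_gamma ~ N(0, (sigma2 / lam) I) on the active coefficients, and an independent Bernoulli(`w`) prior on each inclusion indicator. The function name and all parameter names are hypothetical. Under these assumptions the coefficients integrate out analytically, and one Cholesky factorization of the small matrix X_gamma^T X_gamma + lam I gives the log posterior of a model up to a normalizing constant.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular


def log_model_posterior(X, y, gamma, lam=1.0, w=0.1, sigma2=1.0):
    """Unnormalized log posterior of the model indexed by the 0/1 vector gamma.

    Assumed (hypothetical) setup: y | beta ~ N(X_gamma beta, sigma2 I),
    beta ~ N(0, (sigma2 / lam) I), each gamma_j ~ Bernoulli(w), sigma2 known.
    """
    gamma = np.asarray(gamma, dtype=bool)
    n, p = X.shape
    k = int(gamma.sum())

    # Prior on the model space: independent Bernoulli(w) inclusion indicators.
    log_prior = k * np.log(w) + (p - k) * np.log1p(-w)

    # Marginal likelihood: y | gamma ~ N(0, sigma2 (I + X_g X_g^T / lam)).
    quad = float(y @ y)
    log_det = 0.0
    if k > 0:
        Xg = X[:, gamma]
        A = Xg.T @ Xg + lam * np.eye(k)        # k x k, positive definite
        L = cholesky(A, lower=True)            # A = L L^T
        v = solve_triangular(L, Xg.T @ y, lower=True)
        quad -= float(v @ v)                   # Woodbury identity for the quadratic form
        # det(I_n + X_g X_g^T / lam) = det(A) / lam^k
        log_det = 2.0 * np.sum(np.log(np.diag(L))) - k * np.log(lam)

    log_marginal = (-0.5 * n * np.log(2.0 * np.pi * sigma2)
                    - 0.5 * log_det
                    - 0.5 * quad / sigma2)
    return log_prior + log_marginal
```

Evaluating this quantity for two candidate models and taking the difference gives their log posterior odds under the stated assumptions. A stochastic search would compare many neighboring models in this way, and when neighbors differ by a single variable the Cholesky factor can be updated instead of recomputed from scratch.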
Pages: 61 - 73
Number of pages: 13
Related papers
50 in total
  • [21] On Numerical Aspects of Bayesian Model Selection in High and Ultrahigh-dimensional Settings
    Johnson, Valen E.
    BAYESIAN ANALYSIS, 2013, 8 (04): : 741 - 758
  • [22] Model-free feature screening for ultra-high dimensional competing risks data
    Chen, Xiaolin
    Zhang, Yahui
    Liu, Yi
    Chen, Xiaojing
    STATISTICS & PROBABILITY LETTERS, 2020, 164
  • [23] Ultra-high dimensional variable selection with application to normative aging study: DNA methylation and metabolic syndrome
    Yoon, Grace
    Zheng, Yinan
    Zhang, Zhou
    Zhang, Haixiang
    Gao, Tao
    Joyce, Brian
    Zhang, Wei
    Guan, Weihua
    Baccarelli, Andrea A.
    Jiang, Wenxin
    Schwartz, Joel
    Vokonas, Pantel S.
    Hou, Lifang
    Liu, Lei
    BMC BIOINFORMATICS, 18
  • [24] Ultra-high dimensional variable selection with application to normative aging study: DNA methylation and metabolic syndrome
    Yoon, Grace
    Zheng, Yinan
    Zhang, Zhou
    Zhang, Haixiang
    Gao, Tao
    Joyce, Brian
    Zhang, Wei
    Guan, Weihua
    Baccarelli, Andrea A.
    Jiang, Wenxin
    Schwartz, Joel
    Vokonas, Pantel S.
    Hou, Lifang
    Liu, Lei
    BMC BIOINFORMATICS, 2017, 18
  • [25] Robust model-free feature screening based on modified Hoeffding measure for ultra-high dimensional data
    Yu, Yuan
    He, Di
    Zhou, Yong
    STATISTICS AND ITS INTERFACE, 2018, 11 (03) : 473 - 489
  • [26] Two Tales of Variable Selection for High Dimensional Regression: Screening and Model Building
    Liu, Cong
    Shi, Tao
    Lee, Yoonkyung
    STATISTICAL ANALYSIS AND DATA MINING, 2014, 7 (02) : 140 - 159
  • [27] Bayesian Variable Selection in Semiparametric Proportional Hazards Model for High Dimensional Survival Data
    Lee, Kyu Ha
    Chakraborty, Sounak
    Sun, Jianguo
    INTERNATIONAL JOURNAL OF BIOSTATISTICS, 2011, 7 (01):
  • [28] Posterior model consistency in high-dimensional Bayesian variable selection with arbitrary priors
    Hua, Min
    Goh, Gyuhyeong
    STATISTICS & PROBABILITY LETTERS, 2025, 223
  • [29] Bayesian variable selection and model averaging in high-dimensional multinomial nonparametric regression
    Yau, P
    Kohn, R
    Wood, S
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2003, 12 (01) : 23 - 54
  • [30] Bayesian variable selection in multinomial probit model for classifying high-dimensional data
    Yang, Aijun
    Li, Yunxian
    Tang, Niansheng
    Lin, Jinguan
    COMPUTATIONAL STATISTICS, 2015, 30 : 399 - 418