Incorporating Grouping Information in Bayesian Variable Selection with Applications in Genomics

Cited by: 29
Authors
Rockova, Veronika [1]
Lesaffre, Emmanuel [1, 2]
Affiliations
[1] Erasmus Univ, Dept Biostat, Erasmus MC, NL-3000 DR Rotterdam, Netherlands
[2] Katholieke Univ Leuven, L BioStat, Louvain, Belgium
Source
BAYESIAN ANALYSIS | 2014 / Volume 9 / Issue 01
Keywords
Bayesian shrinkage estimation; EM algorithm; Bayesian LASSO; Minorization-maximization; NONCONCAVE PENALIZED LIKELIHOOD; MODEL SELECTION; REGRESSION; NETWORK; REGULARIZATION; EXPRESSION; PRIORS; CELLS;
DOI
10.1214/13-BA846
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
In many applications it is of interest to determine a limited number of important explanatory factors (representing groups of potentially overlapping predictors) rather than original predictor variables. The often imposed requirement that the clustered predictors should enter the model simultaneously may be limiting as not all the variables within a group need to be associated with the outcome. Within-group sparsity is often desirable as well. Here we propose a Bayesian variable selection method, which uses the grouping information as a means of introducing more equal competition to enter the model within the groups rather than as a source of strict regularization constraints. This is achieved within the context of Bayesian LASSO (least absolute shrinkage and selection operator) by allowing each regression coefficient to be penalized differentially and by considering an additional regression layer to relate individual penalty parameters to a group identification matrix. The proposed hierarchical model therefore enables inference simultaneously on two levels: (1) the regression layer for the continuous outcome in relation to the predictors and (2) the regression layer for the penalty parameters in relation to the grouping information. Both situations with overlapping and non-overlapping groups are applicable. The method does not assume within-group homogeneity across the regression coefficients, which is implicit in many structured penalized likelihood approaches. The smoothness here is enforced at the penalty level rather than within the regression coefficients. To enhance the potential of the proposed method we develop two rapid computational procedures based on the expectation maximization (EM) algorithm, which offer substantial time savings in applications where the high-dimensionality renders Markov chain Monte Carlo (MCMC) approaches less practical. 
We demonstrate the usefulness of our method in predicting time to death in glioblastoma patients using pathways of genes.
Pages: 221 - 257
Number of pages: 37