Bayesian feature selection in high-dimensional regression in presence of correlated noise

Cited by: 1
Authors
Feldman, Guy [1 ]
Bhadra, Anindya [1 ]
Kirshner, Sergey [1 ]
Affiliations
[1] Purdue Univ, Dept Stat, 250 N Univ St, W Lafayette, IN 47907 USA
Source
STAT | 2014 / Vol. 3 / No. 1
Keywords
Bayesian methods; genomics; graphical models; high-dimensional data; variable selection
DOI
10.1002/sta4.60
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
We consider the problem of feature selection in a high-dimensional, multiple-predictor, multiple-response regression setting. Assuming that regression errors are i.i.d. when they are in fact dependent leads to inconsistent and inefficient feature estimates. We relax the i.i.d. assumption by allowing the errors to exhibit a tree-structured dependence. This allows a Bayesian problem formulation in which the error dependence structure is treated as an auxiliary variable that can be integrated out analytically with the help of the matrix-tree theorem. Mixing over trees yields a flexible technique for modelling the graphical structure of the regression errors. Furthermore, the analytic integration results in a collapsed Gibbs sampler for feature selection that is computationally efficient. Our approach offers significant performance gains over competing methods in simulations, especially when the features themselves are correlated. In addition to comprehensive simulation studies, we apply our method to a high-dimensional breast cancer data set to identify markers significantly associated with the disease. Copyright (C) 2014 John Wiley & Sons, Ltd.
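The abstract's key computational device is the matrix-tree theorem, which lets a sum over all spanning trees be evaluated as a single determinant of a graph Laplacian minor. The sketch below is purely illustrative (it is not the authors' code and uses only the classical counting form of the theorem): for an unweighted graph, any cofactor of the Laplacian equals the number of spanning trees.

```python
import numpy as np

def spanning_tree_count(adj):
    """Count spanning trees via Kirchhoff's matrix-tree theorem.

    adj: symmetric 0/1 adjacency matrix of an undirected graph.
    The Laplacian is L = D - A; deleting any one row and the
    matching column and taking the determinant gives the count.
    """
    adj = np.asarray(adj, dtype=float)
    lap = np.diag(adj.sum(axis=1)) - adj
    minor = lap[1:, 1:]  # delete row 0 and column 0 (any index works)
    return int(round(np.linalg.det(minor)))

# Complete graph K4: Cayley's formula predicts 4^(4-2) = 16 spanning trees.
K4 = (1 - np.eye(4)).astype(int)
print(spanning_tree_count(K4))  # → 16
```

In the weighted case used for mixing over tree-structured error dependencies, the same determinant computes a sum of edge-weight products over all trees, which is what makes the analytic integration (and hence the collapsed Gibbs sampler) tractable.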
Citation
Pages: 258 - 272
Page count: 15
Related Papers
50 records
  • [1] Sparse Bayesian variable selection in high-dimensional logistic regression models with correlated priors
    Ma, Zhuanzhuan
    Han, Zifei
    Ghosh, Souparno
    Wu, Liucang
    Wang, Min
    STATISTICAL ANALYSIS AND DATA MINING, 2024, 17 (01)
  • [2] Preconditioning for feature selection and regression in high-dimensional problems
    Paul, Debashis
    Bair, Eric
    Hastie, Trevor
    Tibshirani, Robert
    ANNALS OF STATISTICS, 2008, 36 (04): : 1595 - 1618
  • [3] Efficient Learning and Feature Selection in High-Dimensional Regression
    Ting, Jo-Anne
    D'Souza, Aaron
    Vijayakumar, Sethu
    Schaal, Stefan
    NEURAL COMPUTATION, 2010, 22 (04) : 831 - 886
  • [4] Bayesian feature selection for high-dimensional linear regression via the Ising approximation with applications to genomics
    Fisher, Charles K.
    Mehta, Pankaj
    BIOINFORMATICS, 2015, 31 (11) : 1754 - 1761
  • [5] Fully Bayesian logistic regression with hyper-LASSO priors for high-dimensional feature selection
    Li, Longhai
    Yao, Weixin
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (14) : 2827 - 2851
  • [6] Bayesian Regression Trees for High-Dimensional Prediction and Variable Selection
    Linero, Antonio R.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (522) : 626 - 636
  • [7] Unbiased Prediction and Feature Selection in High-Dimensional Survival Regression
    Laimighofer, Michael
    Krumsiek, Jan
    Buettner, Florian
    Theis, Fabian J.
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2016, 23 (04) : 279 - 290
  • [8] Bayesian Dynamic Feature Partitioning in High-Dimensional Regression With Big Data
    Gutierrez, Rene
    Guhaniyogi, Rajarshi
    TECHNOMETRICS, 2022, 64 (02) : 224 - 240
  • [9] Stability of feature selection in classification issues for high-dimensional correlated data
    Perthame, Emeline
    Friguet, Chloe
    Causeur, David
    STATISTICS AND COMPUTING, 2016, 26 (04) : 783 - 796