Bayesian Variable Selection Under Collinearity

Cited by: 33
Authors
Ghosh, Joyee [1]
Ghattas, Andrew E. [2]
Affiliations
[1] Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52242 USA
[2] Univ Iowa, Dept Biostat, Iowa City, IA 52242 USA
Source
AMERICAN STATISTICIAN | 2015 / Vol. 69 / Issue 03
Keywords
Bayesian model averaging; Linear regression; Marginal inclusion probability; Median probability model; Multimodality; Zellner's g-prior; PRIORS; REGULARIZATION;
DOI
10.1080/00031305.2015.1031827
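Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code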
020208 ; 070103 ; 0714 ;
Abstract
In this article, we highlight some interesting facts about Bayesian variable selection methods for linear regression models in settings where the design matrix exhibits strong collinearity. We first demonstrate via real data analysis and simulation studies that summaries of the posterior distribution based on marginal and joint distributions may give conflicting results for assessing the importance of strongly correlated covariates. The natural question is which one should be used in practice. The simulation studies suggest that posterior inclusion probabilities and Bayes factors that evaluate the importance of correlated covariates jointly are more appropriate, and some priors may be more adversely affected in such a setting. To obtain a better understanding behind the phenomenon, we study some toy examples with Zellner's g-prior. The results show that strong collinearity may lead to a multimodal posterior distribution over models, in which joint summaries are more appropriate than marginal summaries. Thus, we recommend a routine examination of the correlation matrix and calculation of the joint inclusion probabilities for correlated covariates, in addition to marginal inclusion probabilities, for assessing the importance of covariates in Bayesian variable selection.
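The contrast between marginal and joint summaries described in the abstract can be illustrated with a small sketch. The code below is not the authors' code or data. It assumes simulated data with two nearly identical covariates, a uniform prior over models, Zellner's g-prior with g = n, and the standard closed-form Bayes factor against the null model, (1 + g)^((n - 1 - k)/2) / (1 + g(1 - R²))^((n - 1)/2). The joint inclusion probability is taken here to be the posterior probability that at least one of the correlated covariates is in the model, which is an assumption of this sketch. All names (`g_prior_model_posterior`, `x1`, `x2`, `x3`) are hypothetical.

```python
import itertools
import numpy as np

def g_prior_model_posterior(X, y, g=None):
    """Enumerate all 2^p models and return (models, posterior probabilities).

    Assumes Zellner's g-prior on the slopes, a flat prior on the intercept,
    and a uniform prior over models (assumptions of this sketch).
    """
    n, p = X.shape
    g = float(n) if g is None else float(g)
    Xc = X - X.mean(axis=0)          # centering absorbs the intercept
    yc = y - y.mean()
    tss = float(yc @ yc)
    models = list(itertools.product([0, 1], repeat=p))
    log_bf = np.empty(len(models))
    for i, gamma in enumerate(models):
        idx = [j for j, inc in enumerate(gamma) if inc]
        if idx:
            beta, *_ = np.linalg.lstsq(Xc[:, idx], yc, rcond=None)
            resid = yc - Xc[:, idx] @ beta
            r2 = 1.0 - float(resid @ resid) / tss
        else:
            r2 = 0.0
        k = len(idx)
        # log Bayes factor of this model against the null model
        log_bf[i] = (0.5 * (n - 1 - k) * np.log1p(g)
                     - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2)))
    w = np.exp(log_bf - log_bf.max())   # subtract the max for stability
    return np.array(models), w / w.sum()

# Simulated data (hypothetical): x2 is almost a copy of x1, x3 is pure noise.
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
y = x1 + rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

models, probs = g_prior_model_posterior(X, y)

# Marginal inclusion probability of each covariate.
marginal = probs @ models
# Joint inclusion probability: at least one of x1, x2 is in the model.
joint = probs[(models[:, 0] == 1) | (models[:, 1] == 1)].sum()

print("marginal inclusion probabilities:", np.round(marginal, 3))
print("joint inclusion probability (x1 or x2):", round(float(joint), 3))
```

Because the models containing only `x1` and only `x2` fit almost equally well, posterior mass tends to split between them, so each marginal inclusion probability can look modest while the joint probability stays high. Full enumeration is only practical for a small number of covariates.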
Pages: 165 - 173
Page count: 9
Related papers
50 in total
  • [21] On efficient calculations for Bayesian variable selection
    Ruggieri, Eric
    Lawrence, Charles E.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (06) : 1319 - 1332
  • [22] Bayesian variable selection for logistic regression
    Tian, Yiqing
    Bondell, Howard D.
    Wilson, Alyson
    STATISTICAL ANALYSIS AND DATA MINING, 2019, 12 (05) : 378 - 393
  • [23] Sandwich algorithms for Bayesian variable selection
    Ghosh, Joyee
    Tan, Aixin
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2015, 81 : 76 - 88
  • [24] Multivariate Bayesian variable selection and prediction
    Brown, PJ
    Vannucci, M
    Fearn, T
    MINING AND MODELING MASSIVE DATA SETS IN SCIENCE, ENGINEERING, AND BUSINESS WITH A SUBTHEME IN ENVIRONMENTAL STATISTICS, 1997, 29 (01) : 271 - 271
  • [25] CONSISTENCY OF BAYESIAN PROCEDURES FOR VARIABLE SELECTION
    Casella, George
    Giron, F. Javier
    Martinez, M. Lina
    Moreno, Elias
    ANNALS OF STATISTICS, 2009, 37 (03) : 1207 - 1228
  • [26] Bayesian variable selection with related predictors
    Chipman, H
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 1996, 24 (01) : 17 - 36
  • [27] DIAGNOSIS AND TREATMENT OF COLLINEARITY PROBLEMS AND VARIABLE SELECTION IN LEAST-SQUARES MODELS
    MCGIFFEN, ME
    CARMER, SG
    RUESINK, WG
    JOURNAL OF ECONOMIC ENTOMOLOGY, 1988, 81 (05) : 1265 - 1270
  • [28] Testing for Collinearity using Bayesian Analysis
    Assaf, A. George
    Tsionas, Mike
    JOURNAL OF HOSPITALITY & TOURISM RESEARCH, 2021, 45 (06) : 1131 - 1141
  • [29] A note on consistency of Bayesian high-dimensional variable selection under a default prior
    Hua, Min
    Goh, Gyuhyeong
    STAT, 2020, 9 (01):
  • [30] Reducing collinearity by reforming spectral lines with two-dimensional variable selection method
    Luo, Yongshun
    Li, Gang
    Chen, Xu
    Lin, Ling
    JOURNAL OF MOLECULAR STRUCTURE, 2022, 1269