Bayesian variable selection for understanding mixtures in environmental exposures

Cited by: 7
Authors
Kowal, Daniel R. [1]
Bravo, Mercedes [2, 3]
Leong, Henry [3]
Bui, Alexander [4]
Griffin, Robert J. [4]
Ensor, Katherine B. [1]
Miranda, Marie Lynn [3, 5]
Affiliations
[1] Rice Univ, Dept Stat, MS 138, Houston, TX 77251 USA
[2] RTI Int, Biostat & Epidemiol Div, Durham, NC USA
[3] Univ Notre Dame, Childrens Environm Hlth Initiat, Notre Dame, IN 46556 USA
[4] Rice Univ, Dept Civil & Environm Engn, Houston, TX USA
[5] Univ Notre Dame, Dept Appl & Computat Math & Stat, Notre Dame, IN 46556 USA
Keywords
air quality; educational outcomes; lead; prediction; regression; racial residential segregation; blood lead concentrations; horseshoe estimator; childhood; lasso
DOI
10.1002/sim.9099
Chinese Library Classification number
Q [Biological Sciences]
Subject classification code
07; 0710; 09
Abstract
Social and environmental stressors are crucial factors in child development. However, there exists a multitude of measurable social and environmental factors, the effects of which may be cumulative, interactive, or null. Using a comprehensive cohort of children in North Carolina, we study the impact of social and environmental variables on 4th grade end-of-grade exam scores in reading and mathematics. To identify the essential factors that predict these educational outcomes, we design new tools for Bayesian linear variable selection using decision analysis. We extract a predictive optimal subset of explanatory variables by coupling a loss function with a novel model-based penalization scheme, which leads to coherent Bayesian decision analysis and empirically improves variable selection, estimation, and prediction on simulated data. The Bayesian linear model propagates uncertainty quantification to all predictive evaluations, which is important for interpretable and robust model comparisons. These predictive comparisons are conducted out-of-sample with a customized approximation algorithm that avoids computationally intensive model refitting. We apply our variable selection techniques to identify the joint collection of social and environmental stressors, and their interactions, that offer clear and quantifiable improvements in prediction of reading and mathematics exam scores.
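The general idea of selecting variables by decision analysis can be illustrated with a minimal sketch. It is not the authors' model-based penalization scheme or their out-of-sample approximation algorithm: it assumes a conjugate Gaussian prior with unit noise variance, uses a plain subset-size penalty, and searches subsets exhaustively. All function names, the simulated data, and the penalty value are hypothetical.

```python
import itertools
import numpy as np


def posterior_mean(X, y, prior_precision=1.0):
    """Posterior mean of the coefficients under a N(0, I / prior_precision)
    prior, assuming unit noise variance (a simplifying assumption)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + prior_precision * np.eye(p), X.T @ y)


def best_subset_summary(X, beta_hat, penalty):
    """Pick the subset whose linear summary best reproduces the full model's
    fitted values, scoring each subset by squared error + penalty * size."""
    p = X.shape[1]
    fitted = X @ beta_hat
    # Start from the empty subset, which predicts zero everywhere.
    best = (float(fitted @ fitted), (), np.zeros(0))
    for k in range(1, p + 1):
        for subset in itertools.combinations(range(p), k):
            Xs = X[:, subset]
            # Optimal action for this subset: project the fitted values
            # onto the columns in the subset.
            delta, *_ = np.linalg.lstsq(Xs, fitted, rcond=None)
            loss = float(np.sum((fitted - Xs @ delta) ** 2)) + penalty * k
            if loss < best[0]:
                best = (loss, subset, delta)
    return best


# Simulated data (hypothetical): only columns 0 and 2 carry signal.
rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

beta_hat = posterior_mean(X, y)
loss, subset, delta = best_subset_summary(X, beta_hat, penalty=20.0)
print("selected columns:", subset)
print("summary coefficients:", np.round(delta, 2))
```

Exhaustive enumeration is only feasible for a handful of predictors; with many candidate variables and interactions, a penalized or search-based approximation would be needed instead.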
Pages: 4850-4871
Page count: 22