Data-driven confounder selection via Markov and Bayesian networks

Cited by: 25
Authors
Haggstrom, Jenny [1]
Affiliations
[1] Umea Univ, Dept Stat, USBE, SE-90187 Umea, Sweden
Funding
Swedish Research Council;
Keywords
Bayesian networks; Causal inference; Confounding; Covariate selection; Markov networks; Matching; TMLE; CAUSAL; MODELS; INFERENCE; SOFTWARE; PACKAGE; BALANCE; DESIGN;
DOI
10.1111/biom.12788
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
To estimate a causal effect on an outcome without bias, unconfoundedness is often assumed. If there is sufficient knowledge of the underlying causal structure, then existing confounder selection criteria can be used to select subsets of the observed pretreatment covariates, X, sufficient for unconfoundedness, if such subsets exist. Here, estimation of these target subsets is considered when the underlying causal structure is unknown. The proposed method is to model the causal structure by a probabilistic graphical model, for example, a Markov or Bayesian network, estimate this graph from observed data, and select the target subsets given the estimated graph. The approach is evaluated by simulation both in a high-dimensional setting where unconfoundedness holds given X and in a setting where unconfoundedness only holds given subsets of X. Several common target subsets are investigated, and the selected subsets are compared with respect to accuracy in estimating the average causal effect. The proposed method is implemented with existing software that can easily handle high-dimensional data, in terms of both large samples and a large number of covariates. The results from the simulation study show that, if unconfoundedness holds given X, this approach is very successful in selecting the target subsets, outperforming alternative approaches based on random forests and LASSO, and that the subset estimating the target subset containing all causes of the outcome yields the smallest MSE in the average causal effect estimation.
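The estimate-then-select idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes roughly linear relationships and swaps in a simple stand-in for Markov network estimation, namely thresholding partial correlations from the inverse sample covariance matrix. All function names, the threshold of 0.1, and the simulated data-generating process are hypothetical choices made for illustration only.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation of every pair of columns given all other columns."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


def neighbors(data, names, target, threshold=0.1):
    """Variables linked to `target` in the thresholded partial-correlation graph."""
    pcorr = partial_correlations(data)
    t = names.index(target)
    return {
        name
        for j, name in enumerate(names)
        if j != t and abs(pcorr[t, j]) > threshold
    }


def select_subsets(X, T, Y, covariate_names, threshold=0.1):
    """Estimate two target subsets (an assumed, simplified selection rule)."""
    # Covariates adjacent to the treatment in a graph over (X, T) only;
    # Y is left out so that conditioning on it cannot induce spurious edges.
    xt = np.column_stack([X, T])
    to_treatment = neighbors(xt, covariate_names + ["T"], "T", threshold)
    # Covariates adjacent to the outcome in a graph over (X, T, Y).
    xty = np.column_stack([X, T, Y])
    to_outcome = neighbors(xty, covariate_names + ["T", "Y"], "Y", threshold)
    return to_treatment, to_outcome - {"T"}


def ols_effect(X, T, Y, columns):
    """Coefficient of T in a least-squares regression of Y on T and X[:, columns]."""
    design = np.column_stack([np.ones(len(T)), T, X[:, columns]])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coef[1]


# Hypothetical simulation: X1 and X2 affect treatment, X2 and X3 affect outcome,
# X4 and X5 are noise, and the true average causal effect is 2.
rng = np.random.default_rng(0)
n = 5000
names = ["X1", "X2", "X3", "X4", "X5"]
X = rng.normal(size=(n, 5))
T = (X[:, 0] + X[:, 1] + rng.normal(size=n) > 0).astype(float)
Y = 2.0 * T + X[:, 1] + X[:, 2] + rng.normal(size=n)

to_treatment, to_outcome = select_subsets(X, T, Y, names)
columns = [names.index(name) for name in sorted(to_outcome)]
ace = ols_effect(X, T, Y, columns)
print(sorted(to_treatment), sorted(to_outcome), round(ace, 2))
```

In this toy setting, the selected covariates feed a plain regression adjustment; the abstract mentions matching and TMLE as estimators, which would replace `ols_effect` in a more faithful version.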
Pages: 389 - 398
Number of pages: 10
Related papers
50 in total
  • [1] Rejoinder to Discussions on: Data-driven confounder selection via Markov and Bayesian networks
    Haeggstroem, Jenny
    [J]. BIOMETRICS, 2018, 74 (02) : 407 - 410
  • [2] Discussion of "Data-driven confounder selection via Markov and Bayesian networks" by Haggstrom
    Richardson, Thomas S.
    Robins, James M.
    Wang, Linbo
    [J]. BIOMETRICS, 2018, 74 (02) : 403 - 406
  • [3] Discussion of "Data-driven confounder selection via Markov and Bayesian networks" by Jenny Haggstrom
    Kennedy, Edward H.
    Balakrishnan, Sivaraman
    [J]. BIOMETRICS, 2018, 74 (02) : 399 - 402
  • [4] Bayesian principal component regression with data-driven component selection
    Wang, Liuxia
    [J]. JOURNAL OF APPLIED STATISTICS, 2012, 39 (06) : 1177 - 1189
  • [5] A data-driven selection of the number of clusters in the Dirichlet allocation model via Bayesian mixture modelling
    Saraiva, E. F.
    Pereira, C. A. B.
    Suzuki, A. K.
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2019, 89 (15) : 2848 - 2870
  • [6] Bayesian Imaging with Data-Driven Priors Encoded by Neural Networks
    Holden, Matthew
    Pereyra, Marcelo
    Zygalakis, Konstantinos C.
    [J]. SIAM JOURNAL ON IMAGING SCIENCES, 2022, 15 (02) : 892 - 924
  • [7] Data-driven selection of constitutive models via rheology-informed neural networks (RhINNs)
    Saadat, Milad
    Mahmoudabadbozchelou, Mohammadamin
    Jamali, Safa
    [J]. RHEOLOGICA ACTA, 2022, 61 (10) : 721 - 732
  • [8] Data-driven selection of constitutive models via rheology-informed neural networks (RhINNs)
    Milad Saadat
    Mohammadamin Mahmoudabadbozchelou
    Safa Jamali
    [J]. Rheologica Acta, 2022, 61 : 721 - 732
  • [9] Data-Driven Sparse Structure Selection for Deep Neural Networks
    Huang, Zehao
    Wang, Naiyan
    [J]. COMPUTER VISION - ECCV 2018, PT XVI, 2018, 11220 : 317 - 334
  • [10] Delay Propagation in Large Railway Networks with Data-Driven Bayesian Modeling
    Li, Boyu
    Guo, Ting
    Li, Ruimin
    Wang, Yang
    Ou, Yuming
    Chen, Fang
    [J]. TRANSPORTATION RESEARCH RECORD, 2021, 2675 (11) : 472 - 485