A fast bootstrap algorithm for causal inference with large data

Cited by: 0
Authors
Kosko, Matthew [1]
Wang, Lin [2]
Santacatterina, Michele [3]
Affiliations
[1] George Washington Univ, Dept Stat, Washington, DC 20052 USA
[2] Purdue Univ, Dept Stat, W Lafayette, IN USA
[3] NYU, Dept Populat Hlth, New York, NY USA
Keywords
causal bootstrap; covariate balance; machine learning; propensity score; real-world data; PROPENSITY-SCORE; INVERSE PROBABILITY; HORMONE-THERAPY; STRATEGIES; ESTIMATORS; VARIANCE;
DOI
10.1002/sim.10075
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Estimating causal effects from large experimental and observational data has become increasingly prevalent in both industry and research. The bootstrap is an intuitive and powerful technique used to construct standard errors and confidence intervals of estimators. Its application, however, can be prohibitively demanding in settings involving large data. In addition, modern causal inference estimators based on machine learning and optimization techniques exacerbate the computational burden of the bootstrap. The bag of little bootstraps has been proposed in non-causal settings for large data but has not yet been applied to evaluate the properties of estimators of causal effects. In this article, we introduce a new bootstrap algorithm called causal bag of little bootstraps for causal inference with large data. The new algorithm significantly improves the computational efficiency of the traditional bootstrap while providing consistent estimates and desirable confidence interval coverage. We describe its properties, provide practical considerations, and evaluate the performance of the proposed algorithm in terms of bias, coverage of the true 95% confidence intervals, and computational time in a simulation study. We apply it in the evaluation of the effect of hormone therapy on the average time to coronary heart disease using a large observational data set from the Women's Health Initiative.
Pages: 2894-2927
Page count: 34