Invited Commentary: Demystifying Statistical Inference When Using Machine Learning in Causal Research

Cited by: 0
Authors
Balzer, Laura B. [1, 2]
Westling, Ted [3]
Affiliations
[1] Univ Massachusetts Amherst, Dept Biostat & Epidemiol, 427 Arnold House, Amherst, MA 01003 USA
[2] Univ Massachusetts Amherst, Dept Biostat & Epidemiol, Amherst, MA USA
[3] Univ Massachusetts Amherst, Dept Math & Stat, Amherst, MA USA
Keywords
causal inference; cross-fitting; cross-validation; doubly robust; machine learning; nonparametric; Super Learner; TMLE
DOI
Not available
Chinese Library Classification (CLC) number
R1 [Preventive Medicine, Hygiene]
Subject classification code
1004; 120402
Abstract
In this issue, Naimi et al. (Am J Epidemiol. XXXX;XXX(XX):XXXX-XXXX) discuss a critical topic in public health and beyond: obtaining valid statistical inference when using machine learning in causal research. In doing so, the authors review recent prominent methodological work and recommend: 1) doubly robust estimators, such as targeted maximum likelihood estimation (TMLE); 2) ensemble methods, such as Super Learner, to combine predictions from a diverse library of algorithms; and 3) sample splitting to reduce bias and improve inference. We largely agree with these recommendations. In this commentary, we highlight the critical importance of the Super Learner library. Specifically, in both simulation settings considered by the authors, we demonstrate that reductions in bias and improvements in confidence-interval coverage can be achieved using TMLE without sample splitting and with a Super Learner library that excludes tree-based methods but includes regression splines. Whether extremely data-adaptive algorithms and sample splitting are needed depends on the specific problem and should be informed by simulations reflecting the specific application. More research is needed on practical recommendations for selecting among these options in common situations arising in epidemiology.
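The ideas of a doubly robust estimator and sample splitting named in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis and it is not TMLE: it uses the simpler augmented inverse probability weighted (AIPW) estimator, swaps Super Learner for plain stratum-specific means, and runs on hypothetical simulated data with one binary confounder and a true average treatment effect of 1.0. All function names and parameter values are assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def simulate(n, seed=0):
    """Hypothetical data: binary confounder W, binary treatment A, outcome Y.

    The true average treatment effect is 1.0 by construction.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < (0.7 if w else 0.3) else 0
        y = 1.0 * a + 2.0 * w + rng.gauss(0, 1)
        data.append((w, a, y))
    return data


def fit_nuisances(train):
    """Estimate P(A=1 | W) and E[Y | A, W] by stratum means.

    Stand-in for a Super Learner fit; falls back to crude defaults
    if a stratum happens to be empty in the training data.
    """
    overall = mean(y for _, _, y in train)
    g, q = {}, {}
    for w in (0, 1):
        a_vals = [a for (wi, a, _) in train if wi == w]
        g[w] = mean(a_vals) if a_vals else 0.5
        for a in (0, 1):
            ys = [y for (wi, ai, y) in train if wi == w and ai == a]
            q[(a, w)] = mean(ys) if ys else overall
    return g, q


def aipw_cross_fit(data, k=5, seed=0):
    """Cross-fitted AIPW estimate of the average treatment effect.

    Nuisance functions are fit on k-1 folds and evaluated on the held-out
    fold. Returns the point estimate and a Wald-type 95% interval.
    """
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    phi = [0.0] * len(data)
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        g, q = fit_nuisances(train)
        for i in fold:
            w, a, y = data[i]
            gw = min(max(g[w], 0.01), 0.99)  # bound the propensity score
            phi[i] = (
                q[(1, w)] - q[(0, w)]
                + a / gw * (y - q[(1, w)])
                - (1 - a) / (1 - gw) * (y - q[(0, w)])
            )
    est = mean(phi)
    se = stdev(phi) / len(phi) ** 0.5
    return est, (est - 1.96 * se, est + 1.96 * se)


if __name__ == "__main__":
    estimate, (lower, upper) = aipw_cross_fit(simulate(4000, seed=1))
    print(f"ATE estimate: {estimate:.3f}  95% CI: ({lower:.3f}, {upper:.3f})")
```

With nuisance estimators this simple, cross-fitting changes little. That is in line with the abstract's point that whether sample splitting is needed depends on how data-adaptive the library is.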
Pages: 5