Random Forests Approach for Causal Inference with Clustered Observational Data

Cited by: 10
Authors
Suk, Youmi [1]
Kang, Hyunseung [2]
Kim, Jee-Seon [1]
Affiliations
[1] Univ Wisconsin Madison, Dept Educ Psychol, Madison, WI 53706 USA
[2] Univ Wisconsin Madison, Dept Stat, Madison, WI USA
Keywords
Causal inference; machine learning methods; multilevel propensity score matching; multilevel observational data; hierarchical linear modeling; PROPENSITY SCORE ESTIMATION; SELECTION BIAS; STRATIFICATION; REGRESSION; IMPACT
DOI
10.1080/00273171.2020.1808437
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
There is growing interest in using machine learning (ML) methods for causal inference because of their (nearly) automatic and flexible ability to model key quantities such as the propensity score or the outcome model. Unfortunately, most ML methods for causal inference have been studied in single-level settings where all individuals are independent of each other, and there is little work on using these methods with clustered or nested data, a common setting in education studies. This paper investigates using one particular ML method based on random forests, known as Causal Forests, to estimate treatment effects in multilevel observational data. We conduct simulation studies under different types of multilevel data, including two-level, three-level, and cross-classified data. Our simulation studies show that when the ML method is supplemented with estimated propensity scores from multilevel models that account for the clustered/hierarchical structure, the modified ML method outperforms preexisting methods in a wide variety of settings. We conclude by estimating the effect of private math lessons in the Trends in International Mathematics and Science Study data, a large-scale educational assessment where students are nested within schools.
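The role of cluster-aware propensity scores described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's Causal Forests procedure: it swaps in plain inverse probability weighting, and it uses each cluster's treated share as a stand-in for a propensity score from a multilevel model. The data-generating process, the constant effect `TRUE_EFFECT`, and the function names `simulate`, `naive_diff`, and `cluster_ipw` are all assumptions made for illustration.

```python
import math
import random

TRUE_EFFECT = 1.0  # assumed constant treatment effect in this toy simulation


def simulate(n_clusters=200, cluster_size=50, seed=0):
    """Simulate two-level data with a cluster-level confounder u_j (hypothetical DGP)."""
    rng = random.Random(seed)
    rows = []  # each row is (cluster id, treatment z, outcome y)
    for j in range(n_clusters):
        u = rng.gauss(0.0, 1.0)  # cluster-level confounder
        # Treatment probability rises with u_j and stays between 0.2 and 0.8.
        p = 0.2 + 0.6 / (1.0 + math.exp(-u))
        for _ in range(cluster_size):
            z = 1 if rng.random() < p else 0
            y = TRUE_EFFECT * z + 2.0 * u + rng.gauss(0.0, 1.0)
            rows.append((j, z, y))
    return rows


def naive_diff(rows):
    """Difference in mean outcomes, ignoring the cluster structure."""
    treated = [y for _, z, y in rows if z == 1]
    control = [y for _, z, y in rows if z == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def cluster_ipw(rows):
    """IPW estimate using each cluster's treated share as the propensity score."""
    counts = {}  # cluster id -> (cluster size, number treated)
    for j, z, _ in rows:
        n, m = counts.get(j, (0, 0))
        counts[j] = (n + 1, m + z)
    total, used = 0.0, 0
    for j, z, y in rows:
        n, m = counts[j]
        if m == 0 or m == n:
            continue  # skip clusters with no treated/control overlap
        e = m / n  # estimated propensity score for every unit in cluster j
        total += z * y / e - (1 - z) * y / (1 - e)
        used += 1
    return total / used


rows = simulate()
print("naive difference in means:", round(naive_diff(rows), 2))
print("cluster-aware IPW estimate:", round(cluster_ipw(rows), 2))
```

Because the confounder in this sketch varies only between clusters, the naive comparison absorbs between-cluster differences, while the cluster-aware weights compare treated and control units within the same cluster. A real analysis would replace the treated-share shortcut with a fitted multilevel propensity model and covariates.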
Pages: 829-852
Number of pages: 24
Related papers
50 in total
  • [41] Tuning Random Forests for Causal Inference under Cluster-Level Unmeasured Confounding
    Suk, Youmi
    Kang, Hyunseung
    MULTIVARIATE BEHAVIORAL RESEARCH, 2023, 58 (02) : 408 - 440
  • [42] Adaptive Multi-Source Causal Inference from Observational Data
    Thanh Vinh Vo
    Wei, Pengfei
    Trong Nghia Hoang
    Leong, Tze-Yun
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022, : 1975 - 1985
  • [43] Target Trial Emulation: A Framework for Causal Inference From Observational Data
    Hernan, Miguel A.
    Wang, Wei
    Leaf, David E.
    JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2022, 328 (24) : 2446 - 2447
  • [44] Discovering Ancestral Instrumental Variables for Causal Inference From Observational Data
    Cheng, Debo
    Li, Jiuyong
    Liu, Lin
    Yu, Kui
    Le, Thuc Duy
    Liu, Jixue
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (08) : 11542 - 11552
  • [45] Random Allocation in Observational Data: How Small But Robust Effects Could Facilitate Hypothesis-free Causal Inference
    Smith, George Davey
    EPIDEMIOLOGY, 2011, 22 (04) : 460 - 463
  • [46] Causal Inference in Observational Studies in Surgery
    Harrison, Ewen M.
    O'Neill, Stephen
    Wigmore, Stephen J.
    Garden, O. James
    ANNALS OF SURGERY, 2015, 262 (01) : E32 - E32
  • [47] CAUSAL INFERENCE IN OBSERVATIONAL STUDIES: IS IT A FAD?
    Xue, Q.
    Tian, J.
    GERONTOLOGIST, 2013, 53 : 30 - 30
  • [48] Bayesian inference of causal effects from observational data in Gaussian graphical models
    Castelletti, Federico
    Consonni, Guido
    BIOMETRICS, 2021, 77 (01) : 136 - 149
  • [49] Four targets: an enhanced framework for guiding causal inference from observational data
    Lu, Haidong
    Li, Fan
    Lesko, Catherine R.
    Fink, David S.
    Rudolph, Kara E.
    Harhay, Michael O.
    Rentsch, Christopher T.
    Fiellin, David A.
    Gonsalves, Gregg S.
    INTERNATIONAL JOURNAL OF EPIDEMIOLOGY, 2025, 54 (01)
  • [50] AN APPLICATION OF CAUSAL INFERENCE FOR OBSERVATIONAL DATA IN HARD DISK DRIVE FAILURE ANALYSIS
    Zhang, Shaoang
    Ou, Eve
    Limando, Alvin
    16TH ISSAT INTERNATIONAL CONFERENCE ON RELIABILITY AND QUALITY IN DESIGN, 2010, : 65 - +