Random Forests Approach for Causal Inference with Clustered Observational Data

Cited by: 10
Authors
Suk, Youmi [1]
Kang, Hyunseung [2]
Kim, Jee-Seon [1]
Affiliations
[1] Univ Wisconsin Madison, Dept Educ Psychol, Madison, WI 53706 USA
[2] Univ Wisconsin Madison, Dept Stat, Madison, WI USA
Keywords
Causal inference; machine learning methods; multilevel propensity score matching; multilevel observational data; hierarchical linear modeling; propensity score estimation; selection bias; stratification; regression; impact
DOI
10.1080/00273171.2020.1808437
Chinese Library Classification
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
There is growing interest in using machine learning (ML) methods for causal inference because of their (nearly) automatic and flexible ability to model key quantities such as the propensity score or the outcome model. Unfortunately, most ML methods for causal inference have been studied under single-level settings where all individuals are independent of each other, and there is little work on using these methods with clustered or nested data, a common setting in education studies. This paper investigates using one particular ML method based on random forests, known as Causal Forests, to estimate treatment effects in multilevel observational data. We conduct simulation studies under different types of multilevel data, including two-level, three-level, and cross-classified data. Our simulation studies show that when the ML method is supplemented with estimated propensity scores from multilevel models that account for clustered/hierarchical structure, the modified ML method outperforms preexisting methods in a wide variety of settings. We conclude by estimating the effect of private math lessons in the Trends in International Mathematics and Science Study data, a large-scale educational assessment where students are nested within schools.
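As a rough illustration of why the abstract stresses propensity scores that account for cluster structure, the sketch below simulates two-level data with a cluster-level confounder and compares two inverse-probability-weighted estimates. Everything in it is an assumption made for illustration: the data-generating model, the function names, and the use of within-cluster treatment proportions as a simple stand-in for a multilevel propensity model. It does not implement Causal Forests or the authors' procedure.

```python
import numpy as np

def simulate_two_level(n_clusters=200, cluster_size=50, effect=2.0, seed=0):
    """Hypothetical two-level data: a cluster-level confounder u drives
    both treatment assignment z and outcome y."""
    rng = np.random.default_rng(seed)
    cluster = np.repeat(np.arange(n_clusters), cluster_size)
    u = rng.normal(size=n_clusters)[cluster]      # same value within a cluster
    p = 1.0 / (1.0 + np.exp(-u))                  # true propensity score
    z = rng.binomial(1, p)
    y = effect * z + 3.0 * u + rng.normal(size=cluster.size)
    return cluster, z, y

def hajek_ipw(z, y, e):
    """Normalized (Hajek) inverse-probability-weighted effect estimate."""
    w1 = z / e
    w0 = (1 - z) / (1 - e)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def cluster_propensity(cluster, z, clip=0.02):
    """Stand-in for a multilevel propensity model: the treated share in
    each cluster, clipped away from 0 and 1 to keep weights finite."""
    counts = np.bincount(cluster)
    treated = np.bincount(cluster, weights=z)
    e = (treated / counts)[cluster]
    return np.clip(e, clip, 1 - clip)

cluster, z, y = simulate_two_level()

# Single-level propensity: ignores clusters, so every unit gets the same score.
e_single = np.full(z.shape, z.mean())
# Cluster-aware propensity: varies across clusters.
e_cluster = cluster_propensity(cluster, z)

ate_single = hajek_ipw(z, y, e_single)
ate_cluster = hajek_ipw(z, y, e_cluster)
print(f"true effect 2.0 | single-level: {ate_single:.2f} | cluster-aware: {ate_cluster:.2f}")
```

In this toy setup the confounding operates entirely at the cluster level, so a propensity score that ignores clusters leaves the bias in place, while the cluster-aware score removes most of it. Real multilevel data would also involve individual-level covariates and a fitted multilevel model.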
Pages: 829-852
Page count: 24
Related papers
50 in total
  • [21] The Designed Bootstrap for Causal Inference in Big Observational Data
    Yumin Zhang
    Arman Sabbaghi
    Journal of Statistical Theory and Practice, 2021, 15
  • [22] A flexible approach for causal inference with multiple treatments and clustered survival outcomes
    Hu, Liangyuan
    Ji, Jiayi
    Ennis, Ronald D.
    Hogan, Joseph W.
    STATISTICS IN MEDICINE, 2022, 41 (25) : 4982 - 4999
  • [23] CAUSAL INFERENCE FROM OBSERVATIONAL STUDIES WITH CLUSTERED INTERFERENCE, WITH APPLICATION TO A CHOLERA VACCINE STUDY
    Barkley, Brian G.
    Hudgens, Michael G.
    Clemens, John D.
    Ali, Mohammad
    Emch, Michael E.
    ANNALS OF APPLIED STATISTICS, 2020, 14 (03) : 1432 - 1448
  • [24] Bayesian doubly robust estimation of causal effects for clustered observational data
    Zhou, Qi
    He, Haonan
    Zhao, Jie
    Song, Joon Jin
    JOURNAL OF APPLIED STATISTICS, 2025
  • [25] Combining observational and experimental data for causal inference considering data privacy
    Mann, Charlotte Z.
    Sales, Adam C.
    Gagnon-Bartsch, Johann A.
    JOURNAL OF CAUSAL INFERENCE, 2025, 13 (01)
  • [26] Inference for clustered data
    Lee, Chang Hyung
    Steigerwald, Douglas G.
    STATA JOURNAL, 2018, 18 (02) : 447 - 460
  • [27] Causal inference with observational data: A tutorial on propensity score analysis
    Narita, Kaori
    Tena, J. D.
    Detotto, Claudio
    LEADERSHIP QUARTERLY, 2023, 34 (03)
  • [28] Causal inference from observational data and target trial emulation
    Jafarzadeh, S. R.
    Neogi, T.
    OSTEOARTHRITIS AND CARTILAGE, 2022, 30 (11) : 1415 - 1417
  • [29] Causal Inference Methods for Intergenerational Research Using Observational Data
    Frach, Leonard
    Jami, Eshim S. S.
    McAdams, Tom A. A.
    Dudbridge, Frank
    Pingault, Jean-Baptiste
    PSYCHOLOGICAL REVIEW, 2023, 130 (06) : 1688 - 1703
  • [30] CAUSAL INFERENCE FROM OBSERVATIONAL DATA - A REVIEW OF ENDS AND MEANS
    WOLD, H
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-GENERAL, 1956, 119 (01) : 28 - 50