A Comparison of Random Forest-Based Missing Imputation Methods for Covariates in Propensity Score Analysis

Authors
Lee, Yongseok [1]
Leite, Walter L. [2]
Affiliations
[1] Univ Florida, Bur Econ & Business Res, 720 Southwest Second Ave Suite 150, Gainesville, FL 32611 USA
[2] Univ Florida, Sch Human Dev & Org Studies Educ, Gainesville, FL 32611 USA
Keywords
propensity score analysis; missing data; multivariate imputation by chained equations; machine learning; random forests; MULTIPLE IMPUTATION; CHAINED EQUATIONS; CAUSAL INFERENCE; SENSITIVITY-ANALYSIS; MATCHING METHODS; MODELS; ASSUMPTION; ROBUSTNESS; STATISTICS; VARIABLES
DOI
10.1037/met0000676
Chinese Library Classification
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Propensity score analysis (PSA) is a prominent method to alleviate selection bias in observational studies, but missing data in covariates is prevalent and must be dealt with during propensity score estimation. Through Monte Carlo simulations, this study evaluates the use of imputation methods based on multiple random forests algorithms to handle missing data in covariates: multivariate imputation by chained equations-random forest (Caliber), proximity imputation (PI), and missForest. The results indicated that PI and missForest outperformed other methods with respect to bias of average treatment effect regardless of sample size and missing mechanisms. A demonstration of these five methods with PSA to evaluate the effect of participation in center-based care on children's reading ability is provided using data from the Early Childhood Longitudinal Study, Kindergarten Class of 2010-2011.
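The evaluation criterion named in the abstract, bias of the average treatment effect (ATE), can be illustrated with a minimal sketch. The abstract does not say which propensity score method the study applied, so inverse probability weighting is assumed here purely for illustration. The function names `ipw_ate` and `bias` and all numbers are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: weighting is an assumed PSA method, and all
# names and numbers below are hypothetical, not taken from the study.

def ipw_ate(outcomes, treated, propensity):
    """Normalized (Hajek) inverse-probability-weighted ATE estimate."""
    # Treated units are weighted by 1/p, control units by 1/(1 - p).
    w1 = [t / p for t, p in zip(treated, propensity)]
    w0 = [(1 - t) / (1 - p) for t, p in zip(treated, propensity)]
    mean_treated = sum(w * y for w, y in zip(w1, outcomes)) / sum(w1)
    mean_control = sum(w * y for w, y in zip(w0, outcomes)) / sum(w0)
    return mean_treated - mean_control


def bias(estimates, true_ate):
    """Mean of the replicate ATE estimates minus the true ATE."""
    return sum(estimates) / len(estimates) - true_ate


# Toy data: four units, two treated, with equal propensity scores.
outcomes = [3.0, 1.0, 4.0, 2.0]
treated = [1, 0, 1, 0]
propensity = [0.5, 0.5, 0.5, 0.5]
estimate = ipw_ate(outcomes, treated, propensity)
```

In a simulation of the kind the abstract describes, the propensity scores would be estimated after each imputation method has filled in the missing covariate values, and `bias` would be computed over the replicate estimates for each method.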
Pages: 18