Leveraging random assignment to impute missing covariates in causal studies

Cited by: 2
Authors
Kamat, Gauri [1]
Reiter, Jerome P. [2]
Affiliations
[1] Brown Univ, Dept Biostat, Providence, RI 02912 USA
[2] Duke Univ, Dept Stat Sci, Durham, NC USA
Funding
National Science Foundation (USA);
Keywords
Experiment; trial; missing; imputation; non-ignorable; randomization; MULTIPLE IMPUTATION; BAYESIAN-INFERENCE; MODELS; IDENTIFIABILITY; NONCOMPLIANCE; REGRESSION; DESIGN; VALUES;
DOI
10.1080/00949655.2020.1849217
Chinese Library Classification
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
Baseline covariates in randomized experiments are often used in the estimation of treatment effects, for example, when estimating treatment effects within covariate-defined subgroups. In practice, however, covariate values may be missing for some data subjects. To handle missing values, analysts can use imputation methods to create completed datasets, from which they can estimate treatment effects. Common imputation methods include mean imputation, single imputation via regression, and multiple imputation. For each of these methods, we investigate the benefits of leveraging randomized treatment assignment in the imputation routines, that is, making use of the fact that the true covariate distributions are the same across treatment arms. We do so using simulation studies that compare the quality of inferences when we respect or disregard the randomization. We consider this question for imputation routines implemented using covariates only, and imputation routines implemented using the outcome variable. In either case, accounting for randomization offers only small gains in accuracy for our simulation scenarios. Our results also shed light on the performances of these different procedures for imputing missing covariates in randomized experiments when one seeks to estimate heterogeneous treatment effects.
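The contrast the abstract draws for mean imputation can be illustrated with a minimal sketch. It rests on one reading of the abstract: "respecting" the randomization means imputing from a single pooled mean, because the covariate distribution is the same in every arm, while "disregarding" it means imputing separately within each arm. The function names and toy data below are hypothetical and are not taken from the paper.

```python
import numpy as np

def impute_mean_pooled(x):
    """Respect randomization (assumed reading): fill every missing value
    with one mean computed from the observed values of all arms."""
    x = np.asarray(x, dtype=float).copy()
    x[np.isnan(x)] = np.nanmean(x)
    return x

def impute_mean_by_arm(x, treat):
    """Disregard randomization (assumed reading): fill missing values
    with the observed mean of the unit's own treatment arm."""
    x = np.asarray(x, dtype=float).copy()
    treat = np.asarray(treat)
    for arm in np.unique(treat):
        in_arm = treat == arm
        missing = in_arm & np.isnan(x)
        x[missing] = np.nanmean(x[in_arm])
    return x

# Toy data (hypothetical): one baseline covariate, two arms, two missing values.
x = np.array([1.0, np.nan, 3.0, 5.0, np.nan, 9.0])
treat = np.array([0, 0, 0, 1, 1, 1])

pooled = impute_mean_pooled(x)         # both gaps get the pooled mean
by_arm = impute_mean_by_arm(x, treat)  # each gap gets its own arm's mean
```

In a small sample the arm-specific means can differ by chance even though the arms share one covariate distribution; the pooled version ignores that chance imbalance when filling the gaps.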
Pages: 1275-1305
Page count: 31
Related papers
  • [1] Bayesian nonparametric generative models for causal inference with missing at random covariates
    Roy, Jason
    Lum, Kirsten J.
    Zeldow, Bret
    Dworkin, Jordan D.
    Lo Re, Vincent
    Daniels, Michael J.
    [J]. BIOMETRICS, 2018, 74 (04) : 1193 - 1202
  • [2] Random forest with Random projection to impute missing gene expression data
    Gondara, Lovedeep
    [J]. 2015 IEEE 14TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2015, : 1251 - 1256
  • [3] Cox Regression with Covariates Missing Not at Random
    Cook V.J.
    Hu X.J.
    Swartz T.B.
    [J]. Statistics in Biosciences, 2011, 3 (2) : 208 - 222
  • [4] QUANTILE REGRESSION WITH COVARIATES MISSING AT RANDOM
    Wei, Ying
    Yang, Yunwen
    [J]. STATISTICA SINICA, 2014, 24 (03) : 1277 - 1299
  • [5] Regression Analysis with Covariates Missing at Random: A Piece-wise Nonparametric Model for Missing Covariates
    Zhao, Yang
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2009, 38 (20) : 3736 - 3744
  • [6] Weighted expectile regression with covariates missing at random
    Pan, Yingli
    Liu, Zhan
    Song, Guangyu
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2023, 52 (03) : 1057 - 1076
  • [7] Model averaging with covariates that are missing completely at random
    Zhang, Xinyu
    [J]. ECONOMICS LETTERS, 2013, 121 (03) : 360 - 363
  • [8] Causal Inference with Noisy and Missing Covariates via Matrix Factorization
    Kallus, Nathan
    Mao, Xiaojie
    Udell, Madeleine
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [9] Distributed estimation for linear regression with covariates missing at random
    Pan, Yingli
    Wang, Haoyu
    Xu, Kaidong
    Huang, He
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2023,
  • [10] Causal inference with confounders missing not at random
    Yang, S.
    Wang, L.
    Ding, P.
    [J]. BIOMETRIKA, 2019, 106 (04) : 875 - 888