Efficient and doubly robust imputation for covariate-dependent missing responses

Cited by: 52
Authors
Qin, Jing [1]
Shao, Jun [2]
Zhang, Biao [3]
Affiliations
[1] NIAID, Biostat Res Branch, NIH, Bethesda, MD 20892 USA
[2] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA
[3] Univ Toledo, Dept Math, Toledo, OH 43606 USA
Funding
National Science Foundation (USA)
Keywords
covariate-dependent missing mechanism; doubly robust; imputation; local efficiency; model-assisted;
DOI
10.1198/016214508000000238
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
In this article we study a well-known response missing-data problem. Missing data is a ubiquitous problem in medical and social science studies. Imputation is one of the most popular methods for dealing with missing data. The most commonly used imputation that makes use of covariates is regression imputation, in which the regression model can be parametric, semiparametric, or nonparametric. Parametric regression imputation is efficient but is not robust against misspecification of the regression model. Although nonparametric regression imputation (such as nearest-neighbor imputation and kernel regression imputation) is model-free, it is not efficient, especially if the dimension of the covariate vector is high (the well-known curse of dimensionality). Semiparametric regression imputation (such as partially linear regression imputation) can reduce the dimension of the covariate in nonparametric regression fitting but is not robust against misspecification of the linear component in the regression. Assuming that the missing mechanism is covariate-dependent and that the propensity function can be specified correctly, we propose a regression imputation method that has good efficiency and is robust against regression model misspecification. Furthermore, our method is valid as long as either the regression model or the propensity model is correct, a property known as double robustness. We show that asymptotically the sample mean based on our imputation achieves the semiparametric efficiency lower bound if both the regression and propensity models are specified correctly. Our simulation results demonstrate that the proposed method outperforms many existing methods for handling missing data, especially when the regression model is misspecified. As an illustration, an economic observational data set is analyzed.
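The double-robustness property described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' imputation procedure; it uses the standard augmented inverse-probability-weighted (AIPW) estimator of a mean as a stand-in, on simulated data. All names (`fit_logistic`, `aipw_mean`, the data-generating model, and the two deliberately misspecified working models) are assumptions made for illustration only.

```python
import numpy as np

def fit_logistic(Z, r, iters=25):
    """Logistic regression by Newton's method; Z must include an intercept column."""
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        grad = Z.T @ (r - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        beta += np.linalg.solve(hess, grad)
    return beta

def aipw_mean(y, r, m_hat, pi_hat):
    """Standard AIPW estimate of E[Y]: mean of m(X) + R * (Y - m(X)) / pi(X)."""
    y_obs = np.where(r == 1, y, 0.0)  # missing responses never enter the sum
    return np.mean(m_hat + r * (y_obs - m_hat) / pi_hat)

# Simulated data (assumed model): E[Y] = 1, missingness depends on X only.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y_full = 1.0 + 2.0 * x + rng.normal(size=n)
pi_true = 1.0 / (1.0 + np.exp(-(0.5 + x)))
r = (rng.uniform(size=n) < pi_true).astype(float)
y = np.where(r == 1, y_full, np.nan)  # responses with R = 0 are unobserved
mask = r == 1

Z = np.column_stack([np.ones(n), x])

# Working propensity models: correct (logistic in X) and wrong (constant).
pi_good = 1.0 / (1.0 + np.exp(-(Z @ fit_logistic(Z, r))))
pi_bad = np.full(n, r.mean())

# Working regression models: correct (linear in X) and wrong (intercept only).
coef = np.linalg.lstsq(Z[mask], y[mask], rcond=None)[0]
m_good = Z @ coef
m_bad = np.full(n, y[mask].mean())

naive = y[mask].mean()                     # complete-case mean, biased here
est_a = aipw_mean(y, r, m_bad, pi_good)    # wrong regression, correct propensity
est_b = aipw_mean(y, r, m_good, pi_bad)    # correct regression, wrong propensity

print(f"complete-case: {naive:.3f}  AIPW (a): {est_a:.3f}  AIPW (b): {est_b:.3f}")
```

In this setup the complete-case mean is pulled away from the true value of 1 because respondents tend to have larger X, while both AIPW estimates should land near 1 even though one working model is wrong in each case.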
Pages: 797-810
Number of pages: 14
Related papers
50 in total
  • [21] Doubly Robust Covariate Shift Correction
    Reddi, Sashank J.
    Poczos, Barnabas
    Smola, Alex
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015 : 2949 - 2955
  • [22] Panel kink threshold model with multiple covariate-dependent thresholds
    Yang, Lixiong
    Yao, Liangyan
    Xie, Yanli
    APPLIED ECONOMICS LETTERS, 2024
  • [23] Improving forecasting performance using covariate-dependent copula models
    Li, Feng
    Kang, Yanfei
    INTERNATIONAL JOURNAL OF FORECASTING, 2018, 34 (03) : 456 - 476
  • [24] Threshold mixed data sampling models with a covariate-dependent threshold
    Yang, Lixiong
    Zhang, Chunli
    APPLIED ECONOMICS LETTERS, 2023, 30 (12) : 1708 - 1716
  • [25] A robust imputation method for missing responses and covariates in sample selection models
    Ogundimu, Emmanuel O.
    Collins, Gary S.
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2019, 28 (01) : 102 - 116
  • [26] Doubly robust estimation under covariate-induced dependent left truncation
    Wang, Yuyao
    Ying, Andrew
    Xu, Ronghui
    BIOMETRIKA, 2024, 111 (03) : 789 - 808
  • [27] COVARIATE-DEPENDENT AGE-AT-ONSET DISTRIBUTIONS FOR HUNTINGTON DISEASE
    KRAWCZAK, M
    BOCKEL, B
    SANDKUIJL, L
    THIES, U
    FENTON, I
    HARPER, PS
    AMERICAN JOURNAL OF HUMAN GENETICS, 1991, 49 (04) : 735 - 745
  • [28] Exact confidence limits for covariate-dependent risk in cultivar trials
    Piepho, HP
    JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS, 2000, 5 (02) : 202 - 213
  • [29] Bayesian Covariate-Dependent Gaussian Graphical Models with Varying Structure
    Ni, Yang
    Stingo, Francesco C.
    Baladandayuthapani, Veerabhadran
    Journal of Machine Learning Research, 2022, 23
  • [30] Panel kink threshold regression model with a covariate-dependent threshold
    Yang, Lixiong
    Zhang, Chunli
    Lee, Chingnun
    Chen, I-Po
    ECONOMETRICS JOURNAL, 2021, 24 (03) : 462 - 481