Recovery of information from multiple imputation: a simulation study

Authors
Lee, Katherine J. [1 ,2 ]
Carlin, John B. [1 ,2 ]
Affiliations
[1] Royal Childrens Hosp, Murdoch Childrens Res Inst, Clin Epidemiol & Biostat Unit, Flemington Rd, Parkville, Vic 3052, Australia
[2] Univ Melbourne, Dept Paediat, Melbourne, Vic 3010, Australia
Source
Emerging Themes in Epidemiology, 9 (1)
Funding
UK Medical Research Council;
Keywords
Missing data; Multiple imputation; Fully conditional specification; Multivariate normal imputation; Non-normal data;
DOI
10.1186/1742-7622-9-3
Chinese Library Classification
R1 [Preventive medicine, hygiene];
Subject classification codes
1004 ; 120402 ;
Abstract
Background: Multiple imputation is becoming increasingly popular for handling missing data. However, it is often implemented without adequate consideration of whether it offers any advantage over complete case analysis for the research question of interest, or whether potential gains may be offset by bias from a poorly fitting imputation model, particularly as the amount of missing data increases.

Methods: Simulated datasets (n = 1000) drawn from a synthetic population were used to explore information recovery from multiple imputation in estimating the coefficient of a binary exposure variable when various proportions of data (10-90%) were set missing at random in a highly-skewed continuous covariate or in the binary exposure. Imputation was performed using multivariate normal imputation (MVNI), with a simple or zero-skewness log transformation to manage non-normality. Bias, precision, mean-squared error and coverage for a set of regression parameter estimates were compared between multiple imputation and complete case analyses.

Results: For missingness in the continuous covariate, multiple imputation produced less bias and greater precision for the effect of the binary exposure variable, compared with complete case analysis, with larger gains in precision with more missing data. However, even with only moderate missingness, large bias and substantial under-coverage were apparent in estimating the continuous covariate's effect when skewness was not adequately addressed. For missingness in the binary covariate, all estimates had negligible bias but gains in precision from multiple imputation were minimal, particularly for the coefficient of the binary exposure.

Conclusions: Although multiple imputation can be useful if covariates required for confounding adjustment are missing, benefits are likely to be minimal when data are missing in the exposure variable of interest. Furthermore, when there are large amounts of missingness, multiple imputation can become unreliable and introduce bias not present in a complete case analysis if the imputation model is not appropriate. Epidemiologists dealing with missing data should keep in mind the potential limitations as well as the potential benefits of multiple imputation. Further work is needed to provide clearer guidelines on effective application of this method.
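
The kind of comparison described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the data-generating model, all coefficient values, the missingness mechanism, and every function name below are assumptions made for illustration. The imputation step is a single-variable Bayesian normal regression on the log scale, a simplified stand-in for MVNI with a simple log transformation, and the pooled estimates use Rubin's rules.

```python
import numpy as np


def simulate(n=1000, seed=0):
    """Hypothetical data-generating model (not the paper's synthetic population)."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, 0.5, n).astype(float)            # binary exposure
    z = np.exp(rng.normal(0.5 * x, 1.0))                 # highly skewed (log-normal) covariate
    y = 1.0 + 0.5 * x + 0.3 * z + rng.normal(0.0, 1.0, n)
    return x, z, y


def set_mar(z, x, prop, rng):
    """Set z missing with a probability that depends only on the observed x (MAR)."""
    p = np.clip(prop * (0.5 + x), 0.0, 1.0)              # averages to prop when P(x = 1) = 0.5
    z_obs = z.copy()
    z_obs[rng.random(len(z)) < p] = np.nan
    return z_obs


def ols(X, y):
    """Least-squares fit: coefficients, their variances, (X'X)^-1, RSS, residual df."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    rss, df = resid @ resid, len(y) - X.shape[1]
    return beta, (rss / df) * np.diag(xtx_inv), xtx_inv, rss, df


def complete_case(x, z_obs, y):
    """Analysis model fitted only to rows where z is observed."""
    obs = ~np.isnan(z_obs)
    b, v, *_ = ols(np.column_stack([np.ones(obs.sum()), x[obs], z_obs[obs]]), y[obs])
    return b, v


def impute_once(x, z_obs, y, rng):
    """One imputed copy of z from a normal linear model for log(z) given x and y."""
    obs = ~np.isnan(z_obs)
    D = np.column_stack([np.ones_like(x), x, y])
    beta, _, xtx_inv, rss, df = ols(D[obs], np.log(z_obs[obs]))
    sigma2 = rss / rng.chisquare(df)                     # draw the residual variance
    beta_star = rng.multivariate_normal(beta, sigma2 * xtx_inv)  # draw the coefficients
    mu = D[~obs] @ beta_star
    z_imp = z_obs.copy()
    z_imp[~obs] = np.exp(mu + rng.normal(0.0, np.sqrt(sigma2), mu.shape))
    return z_imp


def multiple_imputation(x, z_obs, y, m=20, seed=1):
    """Fit the analysis model on m imputed datasets and pool with Rubin's rules."""
    rng = np.random.default_rng(seed)
    est, var = [], []
    for _ in range(m):
        z_imp = impute_once(x, z_obs, y, rng)
        b, v, *_ = ols(np.column_stack([np.ones_like(x), x, z_imp]), y)
        est.append(b)
        var.append(v)
    est, var = np.array(est), np.array(var)
    within = var.mean(axis=0)                            # average within-imputation variance
    between = est.var(axis=0, ddof=1)                    # between-imputation variance
    return est.mean(axis=0), within + (1 + 1 / m) * between, within


if __name__ == "__main__":
    x, z, y = simulate()
    z_obs = set_mar(z, x, prop=0.3, rng=np.random.default_rng(2))
    cc_b, cc_v = complete_case(x, z_obs, y)
    mi_b, mi_v, _ = multiple_imputation(x, z_obs, y)
    print("complete case: exposure coef", cc_b[1], "SE", np.sqrt(cc_v[1]))
    print("imputation:    exposure coef", mi_b[1], "SE", np.sqrt(mi_v[1]))
```

Repeating this over many simulated datasets and over a range of `prop` values would give rough estimates of bias, precision and coverage for each approach. Because the log-scale imputation model is only an approximation to the assumed data-generating model, the sketch can also show the kind of imputation-model misfit the abstract warns about.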
Pages: 10
Related papers
50 in total
  • [1] Recovery of information from multiple imputation: a simulation study
    Katherine J Lee
    John B Carlin
    [J]. Emerging Themes in Epidemiology, 9 (1):
  • [2] Outcome-sensitive multiple imputation: a simulation study
    Evangelos Kontopantelis
    Ian R. White
    Matthew Sperrin
    Iain Buchan
    [J]. BMC Medical Research Methodology, 17
  • [3] Outcome-sensitive multiple imputation: a simulation study
    Kontopantelis, Evangelos
    White, Ian R.
    Sperrin, Matthew
    Buchan, Iain
    [J]. BMC MEDICAL RESEARCH METHODOLOGY, 2017, 17
  • [4] Multiple imputation in veterinary epidemiological studies: a case study and simulation
    Dohoo, Ian R.
    Nielsen, Christel R.
    Emanuelson, Ulf
    [J]. PREVENTIVE VETERINARY MEDICINE, 2016, 129 : 35 - 47
  • [5] Multiple imputation with missing indicators as proxies for unmeasured variables: simulation study
    Matthew Sperrin
    Glen P. Martin
    [J]. BMC Medical Research Methodology, 20
  • [6] Multiple imputation with missing indicators as proxies for unmeasured variables: simulation study
    Sperrin, Matthew
    Martin, Glen P.
    [J]. BMC MEDICAL RESEARCH METHODOLOGY, 2020, 20 (01)
  • [7] A multiple imputation method using population information
    Fushiki, Tadayoshi
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2024,
  • [8] Information leakage and recovery from multiple LCDs
    Choi, Dong Hoon
    Lee, Ho Seong
    Yook, Jong-Gwan
    [J]. 2018 JOINT IEEE INTERNATIONAL SYMPOSIUM ON ELECTROMAGNETIC COMPATIBILITY AND 2018 IEEE ASIA-PACIFIC SYMPOSIUM ON ELECTROMAGNETIC COMPATIBILITY (EMC/APEMC), 2018, : 1053 - 1055
  • [9] Multiple Imputation in the Context of Case-Cohort Studies: Simulation and Case Study
    Middleton, Melissa
    Moreno-Betancur, Margarita
    Carlin, John
    Lee, Katherine J.
    [J]. INTERNATIONAL JOURNAL OF EPIDEMIOLOGY, 2021, 50 : 155 - 155
  • [10] A Simulation Study Comparing Multiple Imputation Methods for Incomplete Longitudinal Ordinal Data
    Donneau, A. F.
    Mauer, M.
    Molenberghs, G.
    Albert, A.
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2015, 44 (05) : 1311 - 1338