Multiple imputation in the presence of an incomplete binary variable created from an underlying continuous variable

Cited by: 9
Authors
Grobler, Anneke C. [1 ,2 ]
Lee, Katherine [1 ,2 ]
Affiliations
[1] Murdoch Childrens Res Inst, Clin Epidemiol & Biostat Unit, Parkville, Vic, Australia
[2] Univ Melbourne, Dept Paediat, Parkville, Vic, Australia
Funding
UK Medical Research Council;
Keywords
binary variable; compatibility; fully conditional specification; multiple imputation; multivariate normal imputation; CHILDREN;
DOI
10.1002/bimj.201900011
Chinese Library Classification
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Multiple imputation (MI) is used to handle missing at random (MAR) data. Despite warnings from statisticians, continuous variables are often recoded into binary variables. With MI it is important that the imputation and analysis models are compatible; variables should be imputed in the same form they appear in the analysis model. With an encoded binary variable more accurate imputations may be obtained by imputing the underlying continuous variable. We conducted a simulation study to explore how best to impute a binary variable that was created from an underlying continuous variable. We generated a completely observed continuous outcome associated with an incomplete binary covariate that is a categorized version of an underlying continuous covariate, and an auxiliary variable associated with the underlying continuous covariate. We simulated data with several sample sizes, and set 25% and 50% of data in the covariate to MAR dependent on the outcome and the auxiliary variable. We compared the performance of five different imputation methods: (a) Imputation of the binary variable using logistic regression; (b) imputation of the continuous variable using linear regression, then categorizing into the binary variable; (c, d) imputation of both the continuous and binary variables using fully conditional specification (FCS) and multivariate normal imputation; (e) substantive-model compatible (SMC) FCS. Bias and standard errors were large when the continuous variable only was imputed. The other methods performed adequately. Imputation of both the binary and continuous variables using FCS often encountered mathematical difficulties. We recommend the SMC-FCS method as it performed best in our simulation studies.
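The data-generating design described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' simulation code: the function name `simulate_mar_data`, the dichotomization threshold of 0, the effect size `beta`, the correlation between the auxiliary variable and the underlying covariate, and the coefficients of the missingness model are all assumptions chosen for the example. It only generates the data and imposes MAR missingness; none of the five imputation methods is implemented here.

```python
import numpy as np


def simulate_mar_data(n=1000, beta=1.0, target_missing=0.25, seed=0):
    """Generate one data set loosely following the abstract's design.

    All numeric settings (threshold 0, beta, 0.5 coefficients) are
    illustrative assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Underlying continuous covariate and an auxiliary variable associated with it.
    x = rng.normal(size=n)
    aux = 0.5 * x + rng.normal(scale=np.sqrt(0.75), size=n)

    # Binary covariate: a categorized version of the continuous covariate.
    b = (x > 0).astype(float)

    # Completely observed continuous outcome associated with the binary covariate.
    y = beta * b + rng.normal(size=n)

    # MAR mechanism: missingness depends only on the fully observed y and aux.
    lin = 0.5 * y + 0.5 * aux

    # Bisection on the intercept so the average missingness probability
    # is close to the target (e.g. 0.25 or 0.50).
    lo, hi = -10.0, 10.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if np.mean(1.0 / (1.0 + np.exp(-(mid + lin)))) > target_missing:
            hi = mid
        else:
            lo = mid
    p_miss = 1.0 / (1.0 + np.exp(-(lo + lin)))
    miss = rng.random(n) < p_miss

    # The binary covariate and its underlying continuous version are missing together.
    return {
        "y": y,
        "aux": aux,
        "x": x,
        "b": b,
        "miss": miss,
        "x_obs": np.where(miss, np.nan, x),
        "b_obs": np.where(miss, np.nan, b),
    }


if __name__ == "__main__":
    data = simulate_mar_data(n=1000, target_missing=0.25, seed=0)
    print("proportion missing:", data["miss"].mean())
```

A data set produced this way could then be passed to whichever imputation routine is being compared, with `b_obs` imputed directly, `x_obs` imputed and then dichotomized, or both handled jointly.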
Pages: 467-478
Number of pages: 12
Related papers
50 in total
  • [21] Variable selection under multiple imputation using the bootstrap in a prognostic study
    Martijn W Heymans
    Stef van Buuren
    Dirk L Knol
    Willem van Mechelen
    Henrica CW de Vet
    BMC Medical Research Methodology, 7
  • [23] Binary partitioning for continuous longitudinal data: categorizing a prognostic variable
    Abdolell, M
    LeBlanc, M
    Stephens, D
    Harrison, RV
    STATISTICS IN MEDICINE, 2002, 21 (22) : 3395 - 3409
  • [24] All varieties of encoding variability are not created equal: Separating variable processing from variable tasks
    Huff, Mark J.
    Bodner, Glen E.
    JOURNAL OF MEMORY AND LANGUAGE, 2014, 73 : 43 - 58
  • [25] Multiple imputation as an alternative to the analysis of non-response in the variable intention to vote
    Rivas, Cristina
    Martinez Roson, Maria del Mar
    Galindo, Purificacion
    REVISTA ESPANOLA DE CIENCIA POLITICA-RECP, 2010, (22): : 99 - 118
  • [26] Variable selection techniques after multiple imputation in high-dimensional data
    Faisal Maqbool Zahid
    Shahla Faisal
    Christian Heumann
    Statistical Methods & Applications, 2020, 29 : 553 - 580
  • [27] Variable selection techniques after multiple imputation in high-dimensional data
    Zahid, Faisal Maqbool
    Faisal, Shahla
    Heumann, Christian
    STATISTICAL METHODS AND APPLICATIONS, 2020, 29 (03): : 553 - 580
  • [28] Full Information Multiple Imputation for Linear Regression Model with Missing Response Variable
    Song, Limin
    Guo, Guangbao
    IAENG International Journal of Applied Mathematics, 2024, 54 (01) : 77 - 81
  • [29] Time Aggregation in Presence of Multiple Variable Energy Resources
    Sarajpoor, Nima
    Rakai, Logan
    Arteaga, Juan
    Amjady, Nima
    Zareipour, Hamidreza
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2024, 39 (01) : 587 - 601
  • [30] From incomplete penetrance to variable expressivity or vice versa
    Gilgenkrantz, Helene
    M S-MEDECINE SCIENCES, 2024, 40 (11): : 864 - 865