Rounding Strategies for Multiply Imputed Binary Data

Cited by: 11
Authors
Demirtas, Hakan [1]
Affiliation
[1] Univ Illinois, Div Epidemiol & Biostat, Chicago, IL 60130 USA
Keywords
Linear mixed-effects model; Missing data; Multiple imputation; Rounding; PATTERN-MIXTURE MODELS; MISSING-DATA; IMPUTATION; PERFORMANCE; BIAS;
DOI
10.1002/bimj.200900018
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Multiple imputation (MI) has emerged in the last two decades as a frequently used approach in dealing with incomplete data. Gaussian and log-linear imputation models are fairly straightforward to implement for continuous and discrete data, respectively. However, in missing data settings that include a mix of continuous and discrete variables, the lack of flexible models for the joint distribution of different types of variables can make the specification of the imputation model a daunting task. The widespread availability of software packages that are capable of carrying out MI under the assumption of joint multivariate normality allows applied researchers to address this complication pragmatically by treating the discrete variables as continuous for imputation purposes and subsequently rounding the imputed values to the nearest observed category. In this article, we compare several rounding rules for binary variables based on simulated longitudinal data sets that have been used to illustrate other missing-data techniques. Using a combination of conditional and marginal data generation mechanisms and imputation models, we study the statistical properties of multiple-imputation-based estimates for various population quantities under different rounding rules from bias and coverage standpoints. We conclude that a good rule should be driven by borrowing information from other variables in the system rather than relying on the marginal characteristics and should be relatively insensitive to imputation model specifications that may potentially be incompatible with the observed data. We also urge researchers to consider the applied context and specific nature of the problem, to avoid uncritical and possibly inappropriate use of rounding in imputation models.
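The practice the abstract describes, treating a binary variable as continuous during imputation and then rounding each imputed value to the nearest observed category, can be sketched as follows. This is only an illustrative baseline, not one of the specific rounding rules compared in the article; the function name `round_to_nearest_category` and the sample values are hypothetical.

```python
import numpy as np

def round_to_nearest_category(imputed, categories=(0, 1)):
    """Map each continuous imputed value to the closest observed category."""
    imputed = np.asarray(imputed, dtype=float)
    cats = np.asarray(categories, dtype=float)
    # Distance from every imputed value to every category; pick the closest.
    nearest = np.abs(imputed[:, None] - cats[None, :]).argmin(axis=1)
    return cats[nearest].astype(int)

# Hypothetical continuous imputations for a binary variable, as a normal
# imputation model might produce (values can fall outside [0, 1]).
imputed_values = np.array([-0.30, 0.12, 0.49, 0.51, 0.88, 1.40])
rounded = round_to_nearest_category(imputed_values)

print(rounded)                # [0 0 0 1 1 1]
print(imputed_values.mean())  # mean of the unrounded imputations
print(rounded.mean())         # mean after rounding; the two can differ
```

Comparing the two means shows how rounding can shift a marginal estimate, which is the kind of bias the article evaluates across rounding rules.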
Pages: 677-688
Page count: 12