Multiple imputation of incomplete multilevel data using Heckman selection models

Cited by: 1
Authors
Munoz, Johanna [1, 7]
Efthimiou, Orestis [2, 3]
Audigier, Vincent [4]
de Jong, Valentijn M. T. [1, 5]
Debray, Thomas P. A. [1, 6]
Affiliations
[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
[2] Univ Bern, Inst Primary Hlth Care BIHAM, Bern, Switzerland
[3] Univ Bern, Inst Social & Prevent Med ISPM, Bern, Switzerland
[4] Lab CEDR MSDMA, Conservatoire Natl Arts & Metiers CNAM, Paris, France
[5] European Med Agcy, Data Analyt & Methods Task Force, Amsterdam, Netherlands
[6] Smart Data Anal & Stat, Utrecht, Netherlands
[7] UMC Utrecht, Julius Ctr Hlth Sci & Primary Care, Str 6-131,POB 85500, NL-3508GA Utrecht, Netherlands
Funding
European Union Horizon 2020
Keywords
Heckman model; IPDMA; missing not at random; selection models; multiple imputation; sample selection; variables; bias
DOI
10.1002/sim.9965
Chinese Library Classification
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Missing data are a common problem in medical research and are commonly addressed using multiple imputation. Although traditional imputation methods allow for valid statistical inference when data are missing at random (MAR), their implementation is problematic when the presence of missingness depends on unobserved variables, that is, when the data are missing not at random (MNAR). Unfortunately, this MNAR situation is rather common in observational studies, registries and other sources of real-world data. While several imputation methods have been proposed for addressing individual studies when data are MNAR, their application and validity in large datasets with multilevel structure remain unclear. We therefore explored the consequences of MNAR data in hierarchical data in depth, and proposed a novel multilevel imputation method for common missing patterns in clustered datasets. This method is based on the principles of Heckman selection models and adopts a two-stage meta-analysis approach to impute binary and continuous variables that may be outcomes or predictors and that are systematically or sporadically missing. After evaluating the proposed imputation model in simulated scenarios, we illustrate its use in a cross-sectional community survey to estimate the prevalence of malaria parasitemia in children aged 2-10 years in five regions in Uganda.
Pages: 514-533
Page count: 20
Related papers
50 in total
  • [31] Handling Incomplete Data Using Evolution of Imputation Methods
    Zawistowski, Pawel
    Grzenda, Maciej
    [J]. ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, 2009, 5495 : 22 - +
  • [32] The Performance of Multilevel Models When Outcome Data Are Incomplete
    Chang, Wanchen
    Pituch, Keenan A.
    [J]. JOURNAL OF EXPERIMENTAL EDUCATION, 2019, 87 (01): : 1 - 16
  • [33] Multiple imputation of missing data in multilevel models with the R package mdmb: a flexible sequential modeling approach
    Simon Grund
    Oliver Lüdtke
    Alexander Robitzsch
    [J]. Behavior Research Methods, 2021, 53 : 2631 - 2649
  • [34] Multiple imputation of missing data in multilevel models with the R package mdmb: a flexible sequential modeling approach
    Grund, Simon
    Luedtke, Oliver
    Robitzsch, Alexander
    [J]. BEHAVIOR RESEARCH METHODS, 2021, 53 (06) : 2631 - 2649
  • [35] Imputation Methods for Incomplete Data
    Umathe, Vaishali H.
    Chaudhary, Gauri
    [J]. 2015 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2015,
  • [36] Cost-effectiveness in clinical trials: using multiple imputation to deal with incomplete cost data
    Burton, Andrea
    Billingham, Lucinda Jane
    Bryan, Stirling
    [J]. CLINICAL TRIALS, 2007, 4 (02) : 154 - 161
  • [37] Difference Between Binomial Proportions Using Newcombe's Method With Multiple Imputation for Incomplete Data
    Sidi, Yulia
    Harel, Ofer
    [J]. AMERICAN STATISTICIAN, 2022, 76 (01): : 29 - 36
  • [38] Multiple imputation combined with bootstrapping for analysing incomplete cost and effect data
    Heymans, M. W.
    De Bruyne, M. C.
    Van Buuren, S.
    [J]. EUROPEAN JOURNAL OF EPIDEMIOLOGY, 2006, 21 : 57 - 57
  • [39] Analyzing incomplete political science data: An alternative algorithm for multiple imputation
    King, G
    Honaker, J
    Joseph, A
    Scheve, K
    [J]. AMERICAN POLITICAL SCIENCE REVIEW, 2001, 95 (01) : 49 - 69
  • [40] Incomplete high-dimensional data imputation algorithm using feature selection and clustering analysis on cloud
    Bu, Fanyu
    Chen, Zhikui
    Zhang, Qingchen
    Yang, Laurence T.
    [J]. JOURNAL OF SUPERCOMPUTING, 2016, 72 (08): : 2977 - 2990