Multiple imputation of incomplete multilevel data using Heckman selection models

Cited by: 1
Authors
Munoz, Johanna [1, 7]
Efthimiou, Orestis [2, 3]
Audigier, Vincent [4]
de Jong, Valentijn M. T. [1, 5]
Debray, Thomas P. A. [1, 6]
Affiliations
[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
[2] Univ Bern, Inst Primary Hlth Care BIHAM, Bern, Switzerland
[3] Univ Bern, Inst Social & Prevent Med ISPM, Bern, Switzerland
[4] Lab CEDR MSDMA, Conservatoire Natl Arts & Metiers CNAM, Paris, France
[5] European Med Agcy, Data Analyt & Methods Task Force, Amsterdam, Netherlands
[6] Smart Data Anal & Stat, Utrecht, Netherlands
[7] UMC Utrecht, Julius Ctr Hlth Sci & Primary Care, Str 6-131,POB 85500, NL-3508GA Utrecht, Netherlands
Funding
European Union Horizon 2020;
Keywords
Heckman model; IPDMA; missing not at random; selection models; multiple imputation; SAMPLE SELECTION; VARIABLES; BIAS;
DOI
10.1002/sim.9965
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Missing data is a common problem in medical research, and is commonly addressed using multiple imputation. Although traditional imputation methods allow for valid statistical inference when data are missing at random (MAR), their implementation is problematic when the presence of missingness depends on unobserved variables, that is, the data are missing not at random (MNAR). Unfortunately, this MNAR situation is rather common in observational studies, registries, and other sources of real-world data. While several imputation methods have been proposed for addressing individual studies when data are MNAR, their application and validity in large datasets with multilevel structure remain unclear. We therefore explored the consequences of MNAR data in hierarchical data in depth, and proposed a novel multilevel imputation method for common missing patterns in clustered datasets. This method is based on the principles of Heckman selection models and adopts a two-stage meta-analysis approach to impute binary and continuous variables that may be outcomes or predictors and that are systematically or sporadically missing. After evaluating the proposed imputation model in simulated scenarios, we illustrate its use in a cross-sectional community survey to estimate the prevalence of malaria parasitemia in children aged 2-10 years in five regions in Uganda.
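The selection-model idea behind the abstract can be illustrated with a small simulation. The sketch below is not the authors' imputation method; it only shows, under assumed parameter values, why a Heckman-type setup matters: an outcome equation and a selection equation share correlated errors (correlation `rho`), so the observed cases are a biased sample when `rho` is nonzero. The function name `simulate_selection_bias`, the coefficients, and the variable `z` (a predictor of missingness only) are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_selection_bias(rho, n=20000, seed=1):
    """Return (observed-case mean of y) - (full-data mean of y).

    Assumed toy model, for illustration only:
      outcome:   y = 1.0 + 0.5 * x + e
      selection: y is observed when 0.3 + 0.7 * z + u > 0
      (e, u) are standard normal with correlation rho.
    """
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # predictor in the outcome equation
        z = rng.gauss(0.0, 1.0)  # predictor of missingness only
        u = rng.gauss(0.0, 1.0)  # selection-equation error
        # Build e so that corr(e, u) = rho and var(e) = 1.
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        y = 1.0 + 0.5 * x + e
        full.append(y)
        if 0.3 + 0.7 * z + u > 0:
            observed.append(y)
    return sum(observed) / len(observed) - sum(full) / len(full)


# rho = 0.8: missingness depends on the unobserved error, so data are MNAR.
bias_mnar = simulate_selection_bias(rho=0.8)
# rho = 0.0: missingness is unrelated to y, so observed cases are representative.
bias_ignorable = simulate_selection_bias(rho=0.0)

print(f"bias with rho=0.8: {bias_mnar:.3f}")
print(f"bias with rho=0.0: {bias_ignorable:.3f}")
```

With correlated errors the observed-case mean overstates the true mean, while with `rho = 0` the difference is close to zero. An imputation model that ignores the selection equation would inherit the first kind of bias.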
Pages: 514-533
Number of pages: 20