Multiple imputation of incomplete multilevel data using Heckman selection models

Cited by: 1

Authors:

Munoz, Johanna ^{[1,7]}

Efthimiou, Orestis ^{[2,3]}

Audigier, Vincent ^{[4]}

de Jong, Valentijn M. T. ^{[1,5]}

Debray, Thomas P. A. ^{[1,6]}

Affiliations:

[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands

[2] Univ Bern, Inst Primary Hlth Care BIHAM, Bern, Switzerland

[3] Univ Bern, Inst Social & Prevent Med ISPM, Bern, Switzerland

[4] Lab CEDR MSDMA, Conservatoire Natl Arts & Metiers CNAM, Paris, France

[5] European Med Agcy, Data Analyt & Methods Task Force, Amsterdam, Netherlands

[6] Smart Data Anal & Stat, Utrecht, Netherlands

[7] UMC Utrecht, Julius Ctr Hlth Sci & Primary Care, Str 6-131, POB 85500, NL-3508GA Utrecht, Netherlands

Source:

STATISTICS IN MEDICINE | 2024 / Vol. 43 / Issue 03

Funding:

European Union Horizon 2020;

Keywords:

Heckman model; IPDMA; missing not at random; selection models; multiple imputation; SAMPLE SELECTION; VARIABLES; BIAS;

DOI:

10.1002/sim.9965

Chinese Library Classification number:

Q [Biological Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Missing data is a common problem in medical research, and is commonly addressed using multiple imputation. Although traditional imputation methods allow for valid statistical inference when data are missing at random (MAR), their implementation is problematic when the presence of missingness depends on unobserved variables, that is, when the data are missing not at random (MNAR). Unfortunately, this MNAR situation is rather common in observational studies, registries and other sources of real-world data. While several imputation methods have been proposed for addressing individual studies when data are MNAR, their application and validity in large datasets with multilevel structure remain unclear. We therefore explored the consequences of MNAR data in hierarchical data in depth, and proposed a novel multilevel imputation method for common missing patterns in clustered datasets. This method is based on the principles of Heckman selection models and adopts a two-stage meta-analysis approach to impute binary and continuous variables that may be outcomes or predictors and that are systematically or sporadically missing. After evaluating the proposed imputation model in simulated scenarios, we illustrate its use in a cross-sectional community survey to estimate the prevalence of malaria parasitemia in children aged 2-10 years in five regions in Uganda.

Pages: 514-533

Page count: 20
