Multiple imputation of binary multilevel missing not at random data

Cited by: 7
Authors
Hammon, Angelina [1, 2, 3]
Zinn, Sabine [1, 3]
Affiliations
[1] DIW Berlin, Mohrenstr 58, D-10117 Berlin, Germany
[2] Univ Bamberg, Bamberg, Germany
[3] LIfBi, Bamberg, Germany
Keywords
Fully conditional specification; Missingness not at random; Multilevel data; Multiple imputation; Selection model; CHAINED EQUATIONS; MODEL; LIKELIHOOD; REGRESSION; INFERENCE; SMOKING;
DOI
10.1111/rssc.12401
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We introduce a selection model-based multilevel imputation approach to be used within the fully conditional specification framework for multiple imputation. Concretely, we apply a censored bivariate probit model to describe binary variables assumed to be missing not at random. The first equation of the model defines the regression model for the missing data mechanism. The second equation specifies the regression model of the variable to be imputed. The non-random selection of the binary data is mapped by correlations between the error terms of the two regression models. Hierarchical data structures are modelled by random intercepts in both equations. To fit the novel imputation model we use maximum likelihood and adaptive Gauss-Hermite quadrature. A comprehensive simulation study shows the overall performance of the approach. We test its usefulness for empirical research by applying it to a common problem in social scientific research: the emergence of educational aspirations. Our software is designed to be used in the R package mice.
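The two-equation structure described in the abstract can be written out as below. This is a sketch in assumed notation: the symbols $r_{ij}$, $y_{ij}$, $u_j$, $\varepsilon_{ij}$ and $\rho$ are illustrative and do not appear in this record. It is written for unit $i$ in cluster $j$ and uses the usual probit normalisation of unit error variances. The abstract does not state the distribution of the random intercepts, so none is given here.

```latex
\begin{align}
% Selection equation (missing-data mechanism): r_{ij} = 1 if y_{ij} is observed
r_{ij}^{*} &= \mathbf{x}_{ij}^{R\top}\boldsymbol{\beta}^{R} + u_{j}^{R} + \varepsilon_{ij}^{R},
  & r_{ij} &= \mathbb{1}\{ r_{ij}^{*} > 0 \}, \\
% Outcome equation (binary variable to be imputed): y_{ij} is seen only when r_{ij} = 1
y_{ij}^{*} &= \mathbf{x}_{ij}^{Y\top}\boldsymbol{\beta}^{Y} + u_{j}^{Y} + \varepsilon_{ij}^{Y},
  & y_{ij} &= \mathbb{1}\{ y_{ij}^{*} > 0 \}, \\
% Correlated error terms carry the non-random selection
\begin{pmatrix} \varepsilon_{ij}^{R} \\ \varepsilon_{ij}^{Y} \end{pmatrix}
  &\sim \mathcal{N}_{2}\!\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}
  \right).
\end{align}
```

Under these assumptions, $\rho \neq 0$ means that missingness depends on the unobserved value itself, even after conditioning on the covariates and the cluster-level intercepts $u_{j}^{R}$ and $u_{j}^{Y}$. With $\rho = 0$ and independent random intercepts, the two equations decouple.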
Pages: 547-564
Number of pages: 18