Multiple imputation for incomplete data with semicontinuous variables

Cited by: 15
Authors
Javaras, KN [1]
Van Dyk, DA [2]
Affiliations
[1] Univ Oxford, Dept Stat, Oxford OX1 3TG, England
[2] Univ Calif Irvine, Dept Stat, Irvine, CA 92697 USA
Keywords
data augmentation; EM algorithm; general location model; missing data; survey data
DOI
10.1198/016214503000000611
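Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract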
We consider the application of multiple imputation to data containing not only partially missing categorical and continuous variables, but also partially missing 'semicontinuous' variables (variables that take on a single discrete value with positive probability but are otherwise continuously distributed). As an imputation model for data sets of this type, we introduce an extension of the standard general location model proposed by Olkin and Tate; our extension, the blocked general location model, provides a robust and general strategy for handling partially observed semicontinuous variables. In particular, we incorporate a two-level model for the semicontinuous variables into the general location model. The first level models the probability that the semicontinuous variable takes on its point mass value, and the second level models the distribution of the variable given that it is not at its point mass. In addition, we introduce EM and data augmentation algorithms for the blocked general location model with missing data; these can be used to generate imputations under the proposed model and have been implemented in publicly available software. We illustrate our model and computational methods via a simulation study and an analysis of a survey of Massachusetts Megabucks Lottery winners.
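The two-level idea in the abstract can be illustrated with a minimal sketch. This is not the blocked general location model itself: it ignores the categorical and continuous covariates, assumes a log-normal distribution off the point mass, and plugs in point estimates rather than drawing parameters by data augmentation. The function names `fit_two_level` and `impute_two_level` and the toy data are hypothetical, chosen only for illustration.

```python
import math
import random
import statistics


def fit_two_level(observed, point_mass=0.0):
    """Estimate both levels from the observed values.

    Level 1: probability that the variable sits at its point mass.
    Level 2: mean and standard deviation of the log of the values that are
    off the point mass (log-normal is an assumption of this sketch only).
    """
    off_mass_logs = [math.log(x) for x in observed if x != point_mass]
    p_mass = 1.0 - len(off_mass_logs) / len(observed)
    mu = statistics.mean(off_mass_logs)
    sigma = statistics.stdev(off_mass_logs)
    return p_mass, mu, sigma


def impute_two_level(values, rng, point_mass=0.0):
    """Return a completed copy of `values`, where None marks a missing entry."""
    observed = [x for x in values if x is not None]
    p_mass, mu, sigma = fit_two_level(observed, point_mass)
    completed = []
    for x in values:
        if x is not None:
            completed.append(x)            # observed values are kept as-is
        elif rng.random() < p_mass:
            completed.append(point_mass)   # level 1: draw the point mass
        else:
            completed.append(math.exp(rng.gauss(mu, sigma)))  # level 2
    return completed


# Made-up toy data: zeros are the point mass, None marks missing entries.
rng = random.Random(0)
values = [0.0, 120.0, None, 0.0, 45.5, None, 300.0, 0.0, None, 80.0]
imputations = [impute_two_level(values, rng) for _ in range(5)]
```

Each element of `imputations` is one completed data set. A proper multiple imputation would also redraw the parameters for every imputation, which is the role of the data augmentation algorithm the abstract mentions.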
Pages: 703-715
Number of pages: 13
Related papers
50 in total
  • [31] Analyzing incomplete political science data: An alternative algorithm for multiple imputation
    King, G
    Honaker, J
    Joseph, A
    Scheve, K
    [J]. AMERICAN POLITICAL SCIENCE REVIEW, 2001, 95 (01) : 49 - 69
  • [32] Auxiliary Variables in Multiple Imputation When Data Are Missing Not at Random
    Mustillo, Sarah
    Kwon, Soyoung
    [J]. JOURNAL OF MATHEMATICAL SOCIOLOGY, 2015, 39 (02) : 73 - 91
  • [33] Multiple imputation for an incomplete covariate that is a ratio
    Morris, Tim P.
    White, Ian R.
    Royston, Patrick
    Seaman, Shaun R.
    Wood, Angela M.
    [J]. STATISTICS IN MEDICINE, 2014, 33 (01) : 88 - 104
  • [35] Multiple Imputation for Bounded Variables
    Geraci, Marco
    McLain, Alexander
    [J]. PSYCHOMETRIKA, 2018, 83 (04) : 919 - 940
  • [36] Nonparametric Bayesian Multiple Imputation for Incomplete Categorical Variables in Large-Scale Assessment Surveys
    Si, Yajuan
    Reiter, Jerome P.
    [J]. JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 2013, 38 (05) : 499 - 521
  • [37] A Hybrid Method for Incomplete Data Imputation
    Zhao, Liang
    Chen, Zhikui
    Yang, Zhennan
    Hu, Yueming
    [J]. 2015 IEEE 17TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2015 IEEE 7TH INTERNATIONAL SYMPOSIUM ON CYBERSPACE SAFETY AND SECURITY, AND 2015 IEEE 12TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (ICESS), 2015, : 1725 - 1730
  • [39] Assessment of predictive performance in incomplete data by combining internal validation and multiple imputation
    Wahl, Simone
    Boulesteix, Anne-Laure
    Zierer, Astrid
    Thorand, Barbara
    van de Wiel, Mark A.
    [J]. BMC MEDICAL RESEARCH METHODOLOGY, 2016, 16 : 1 - 18
  • [40] Special issue: Incomplete data: multiple imputation and model-based analysis
    van Buuren, S
    Eisinga, R
    [J]. STATISTICA NEERLANDICA, 2003, 57 (01) : 1 - 2