Imputation of missing longitudinal data: a comparison of methods

Cited by: 319
Authors
Engels, JM
Diehr, P
Affiliations
[1] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
[2] Univ Washington, Dept Hlth Serv, Seattle, WA 98195 USA
Keywords
missing data; imputation; longitudinal; depression; cohort
DOI
10.1016/S0895-4356(03)00170-7
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Background and Objective: Missing information is inevitable in longitudinal studies, and can result in biased estimates and a loss of power. One approach to this problem is to impute the missing data to yield a more complete data set. Our goal was to compare the performance of 14 methods of imputing missing data on depression, weight, cognitive functioning, and self-rated health in a longitudinal cohort of older adults.
Methods: We identified situations where a person had a known value following one or more missing values, and treated the known value as a "missing value." This "missing value" was imputed using each method and compared to the observed value. Methods were compared on the root mean square error, mean absolute deviation, bias, and relative variance of the estimates.
Results: Most imputation methods were biased toward estimating the "missing value" as too healthy, and most estimates had a variance that was too low. Imputed values based on a person's values before and after the "missing value" were superior to other methods, followed by imputations based on a person's values before the "missing value." Imputations that used no information specific to the person, such as using the sample mean, had the worst performance.
Conclusions: We conclude that, in longitudinal studies where the overall trend is for worse health over time and where missing data can be assumed to be primarily related to worse health, missing data in a longitudinal sequence should be imputed from the available longitudinal data for that person. © 2003 Elsevier Inc. All rights reserved.
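The evaluation design described in the abstract (hold out a known value, impute it, compare with the observed value) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy scores are invented, only three simple imputation rules are shown rather than the 14 methods the study compared, and relative variance is taken here as the variance of the imputed values divided by the variance of the observed values, which may differ from the paper's exact definition. Higher scores are assumed to mean better health, so a positive bias corresponds to imputing "too healthy."

```python
from math import sqrt
from statistics import mean, pvariance

# Hypothetical toy data, one tuple per person:
# (value before, held-out "missing value", value after). Higher = healthier.
triples = [
    (4.0, 3.0, 3.0),
    (5.0, 4.0, 4.0),
    (3.0, 2.0, 1.0),
    (4.0, 4.0, 3.0),
]

observed = [held_out for _, held_out, _ in triples]

# Sample mean of all values that are NOT held out (uses no person-specific information).
sample_mean = mean(v for before, _, after in triples for v in (before, after))

# Three illustrative imputation rules (assumed names, not the paper's labels).
methods = {
    "sample_mean": lambda before, after: sample_mean,
    "before_only": lambda before, after: before,            # carry last value forward
    "before_after": lambda before, after: (before + after) / 2,
}


def evaluate(imputed, observed):
    """Compare imputed values with the held-out observed values."""
    errors = [i - o for i, o in zip(imputed, observed)]
    return {
        "rmse": sqrt(mean(e * e for e in errors)),
        "mad": mean(abs(e) for e in errors),   # mean absolute deviation
        "bias": mean(errors),                  # > 0 means imputed too healthy
        "relative_variance": pvariance(imputed) / pvariance(observed),
    }


results = {
    name: evaluate([rule(b, a) for b, _, a in triples], observed)
    for name, rule in methods.items()
}

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

The ranking this toy data produces says nothing about the study's findings; the sketch only shows how the four comparison metrics can be computed once a known value has been held out.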
Pages: 968-976
Number of pages: 9
Related papers
50 in total
  • [41] Ensemble imputation methods for missing software engineering data
    Twala, B
    Cartwright, M
    [J]. 2005 11TH INTERNATIONAL SYMPOSIUM ON SOFTWARE METRICS (METRICS), 2005, : 268 - 277
  • [42] Comparison of missing value imputation methods in time series: the case of Turkish meteorological data
    Yozgatligil, Ceylan
    Aslan, Sipan
    Iyigun, Cem
    Batmaz, Inci
    [J]. THEORETICAL AND APPLIED CLIMATOLOGY, 2013, 112 (1-2) : 143 - 167
  • [43] New imputation methods for missing data using quantiles
    Munoz, J. F.
    Rueda, M.
    [J]. JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2009, 232 (02) : 305 - 317
  • [44] Imputation methods for missing data in educational diagnostic evaluation
    Fernandez-Alonso, Ruben
    Suarez-Alvarez, Javier
    Muniz, Jose
    [J]. PSICOTHEMA, 2012, 24 (01) : 167 - 175
  • [45] Imputation Methods for Multiple Regression with Missing Heteroscedastic Data
    Asif, Muhammad
    Samart, Klairung
    [J]. THAILAND STATISTICIAN, 2022, 20 (01) : 1 - 15
  • [46] Missing Data Imputation With Baseline Information in Longitudinal Clinical Trials
    Zhang, Yilong
    Zimmer, Zachary
    Xu, Lei
    Lam, Raymond L. H.
    Huyck, Susan
    Golm, Gregory
    [J]. STATISTICS IN BIOPHARMACEUTICAL RESEARCH, 2022, 14 (02) : 242 - 248
  • [47] Missing data imputation in longitudinal trial of endometrial cancer patients
    Rosato, Rosalba
    Pagano, Eva
    Piovano, Elisa
    Fuso, Luca
    Tripodi, Elisa
    Mitidieri, Marco
    Ceccarelli, Manuela
    Zola, Paolo
    Di Cuonzo, Daniela
    [J]. QUALITY OF LIFE RESEARCH, 2016, 25 : 60 - 61
  • [48] Some Concerns About Imputation Methods for Missing Data
    Toyomoto, Rie
    Funada, Satoshi
    Furukawa, Toshi A.
    [J]. JAMA PSYCHIATRY, 2022, 79 (03) : 270 - 270
  • [49] Evaluating Imputation Methods for Missing Data in a MCI Dataset
    Gomez-Valades Batanero, Alba
    Rincon Zamorano, Mariano
    Martinez Tomas, Rafael
    Guerrero Martin, Juan
    [J]. ARTIFICIAL INTELLIGENCE IN NEUROSCIENCE: AFFECTIVE ANALYSIS AND HEALTH APPLICATIONS, PT I, 2022, 13258 : 446 - 454
  • [50] Missing data imputation methods and their performance with biodistance analyses
    Kenyhercz, Michael W.
    Passalacqua, Nicholas V.
    [J]. AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY, 2015, 156 : 185 - 185