Multiple imputation in veterinary epidemiological studies: a case study and simulation

Cited by: 11
Authors
Dohoo, Ian R. [1 ]
Nielsen, Christel R. [2 ]
Emanuelson, Ulf [3 ]
Affiliations
[1] Univ Prince Edward Isl, Atlantic Vet Coll, Dept Hlth Management, Charlottetown, PE C1A 4P3, Canada
[2] Skane Univ Hosp, R&D Ctr Skane, Unit Med Stat & Epidemiol, Lund, Sweden
[3] Swedish Univ Agr Sci, Dept Clin Sci, SE-75007 Uppsala, Sweden
Keywords
Multiple imputation; Questionnaire; Dependent variable; Simulation; MCAR; MAR; NMAR; MISSING DATA; OUTCOME DATA; VALUES; HEALTH
DOI
10.1016/j.prevetmed.2016.04.003
Chinese Library Classification
S85 [Animal medicine (veterinary medicine)]
Discipline classification code
0906
Abstract
The problem of missing data occurs frequently in veterinary epidemiological studies. Most studies use a complete case (CC) analysis, which excludes all observations for which any relevant variable has missing values. Alternative approaches, most notably multiple imputation (MI), avoid the exclusion of observations with missing values and are now widely available, but they have been used very little in veterinary epidemiology. This paper uses a case study based on research into dairy producers' attitudes toward mastitis control procedures, combined with two simulation studies, to evaluate the use of MI and compare its results with those of a CC analysis. MI analysis of the original data produced results that differed only slightly from the CC analysis. However, most of the missing data in the original data set were in the dependent variable, and a subsequent simulation study based on the observed missing data pattern and 1000 simulations showed that an MI analysis would not be expected to offer any advantage over a CC analysis in this situation. This was true regardless of the missing data mechanism underlying the missing values: MCAR (missing completely at random), MAR (missing at random), or NMAR (not missing at random). Surprisingly, recent textbooks dealing with MI make little reference to this limitation of MI for dealing with missing values in the dependent variable. An additional simulation study (1000 runs for each of the three missing data mechanisms) compared MI and CC analyses for data in which varying levels (n = 7) of missing data were created in predictor variables. This study showed that MI analyses generally produced results that were less biased on average, were more precise (smaller SEs), were more consistent (less variability between simulation runs), and consequently were more likely to produce estimates close to the "truth" (results obtained from a data set with no missing values). While the benefit of MI varied with the mechanism used to generate the missing data, MI always performed as well as, or better than, CC analysis. © 2016 Elsevier B.V. All rights reserved.
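The three missing data mechanisms named in the abstract can be illustrated with a small sketch. It is not the authors' simulation code. The variable names (`x`, `y`), the latent-threshold way of creating missingness, the sample size, the 30% missing rate, and the effect size are all assumptions chosen for illustration. The sketch blanks out a predictor `x` under MCAR, MAR, and NMAR, then reports the mean of `x` among the complete cases, whose true value here is 0.

```python
import numpy as np


def missing_mask(x, y, mechanism, rate, rng):
    """Return a boolean mask (True = x is set to missing).

    Hypothetical helper: missingness is driven by a latent score plus noise,
    and the top `rate` fraction of that latent score is marked missing.
    """
    if mechanism == "MCAR":
        # Missingness is unrelated to any variable in the data.
        driver = np.zeros_like(x)
    elif mechanism == "MAR":
        # Missingness depends only on the fully observed variable y.
        driver = (y - y.mean()) / y.std()
    elif mechanism == "NMAR":
        # Missingness depends on the (unobserved) value of x itself.
        driver = (x - x.mean()) / x.std()
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    latent = driver + rng.normal(size=x.shape)
    return latent > np.quantile(latent, 1.0 - rate)


def simulate(n=20000, rate=0.3, seed=0):
    """Create missing x under each mechanism and summarise the complete cases."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)             # predictor with true mean 0
    y = 0.5 * x + rng.normal(size=n)   # outcome, always observed (assumed model)
    results = {}
    for mechanism in ("MCAR", "MAR", "NMAR"):
        mask = missing_mask(x, y, mechanism, rate, rng)
        results[mechanism] = {
            "missing_rate": float(mask.mean()),
            # Complete-case mean of x: uses only rows where x is observed.
            "cc_mean_x": float(x[~mask].mean()),
        }
    return results


if __name__ == "__main__":
    for mechanism, summary in simulate().items():
        print(mechanism, summary)
```

In this setup the complete-case mean of `x` stays near 0 under MCAR and is pulled below 0 under MAR and NMAR, because rows with high `y` or high `x` are more likely to be dropped. This shows only how the mechanisms differ. It does not reproduce the paper's comparison of MI and CC regression estimates.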
Pages: 35-47
Number of pages: 13