A comparison of imputation techniques for handling missing data

Cited by: 174
Authors
Musil, CM [1]
Warner, CB
Yobas, PK
Jones, SL
Affiliations
[1] Case Western Reserve Univ, Frances Payne Bolton Sch Nursing, Dept Sociol, Cleveland, OH 44106 USA
[2] Mahidol Univ, Fac Nursing, Dept Mental Hlth & Psychiat Nursing, Bangkok 10700, Thailand
[3] Kent State Univ, Coll Nursing, Kent, OH 44242 USA
Funding
National Institutes of Health (USA)
DOI
10.1177/019394502762477004
Chinese Library Classification (CLC) number
R47 [Nursing]
Discipline classification code
1011
Abstract
Researchers are commonly faced with the problem of missing data. This article presents theoretical and empirical information for the selection and application of approaches for handling missing data on a single variable. An actual data set of 492 cases with no missing values was used to create a simulated yet realistic data set with missing at random (MAR) data. The authors compare and contrast five approaches (listwise deletion, mean substitution, simple regression, regression with an error term, and the expectation maximization [EM] algorithm) for dealing with missing data, and compare the effects of each method on descriptive statistics and correlation coefficients for the imputed data (n = 96) and the entire sample (n = 492) when imputed data are included. All methods had limitations, although our findings suggest that mean substitution was the least effective and that regression with an error term and the EM algorithm produced estimates closest to those of the original variables.
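To make the contrast between the approaches named in the abstract concrete, the following is a minimal sketch of three of them (mean substitution, simple regression, and regression with an error term) on synthetic data. Everything here is an assumption for illustration: the data are simulated rather than the authors' data set, the function names, coefficients, and the logistic missingness rule are hypothetical, and the EM algorithm is omitted. Only the sample sizes (492 cases, 96 missing) are taken from the abstract.

```python
import numpy as np


def make_mar_data(n=492, n_missing=96, seed=0):
    """Simulate (x, y) and a MAR mask: missingness in y depends only on x.

    The coefficients and the logistic weighting are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 0.6 * x + rng.normal(scale=0.8, size=n)
    weights = 1.0 / (1.0 + np.exp(-x))  # larger x -> more likely missing
    idx = rng.choice(n, size=n_missing, replace=False, p=weights / weights.sum())
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return x, y, mask


def mean_substitution(y, mask):
    """Replace every missing value with the mean of the observed values."""
    out = y.copy()
    out[mask] = y[~mask].mean()
    return out


def regression_imputation(x, y, mask, rng=None):
    """Impute y from x with a line fitted on complete cases.

    If rng is given, add a normal error term scaled by the residual
    standard deviation (regression with an error term).
    """
    slope, intercept = np.polyfit(x[~mask], y[~mask], 1)
    pred = intercept + slope * x[mask]
    if rng is not None:
        resid = y[~mask] - (intercept + slope * x[~mask])
        pred = pred + rng.normal(scale=resid.std(ddof=2), size=int(mask.sum()))
    out = y.copy()
    out[mask] = pred
    return out


x, y, mask = make_mar_data()
results = {
    "original": y,
    "mean substitution": mean_substitution(y, mask),
    "regression": regression_imputation(x, y, mask),
    "regression + error": regression_imputation(
        x, y, mask, rng=np.random.default_rng(1)
    ),
}
# Compare descriptive statistics and the x-y correlation for the full sample.
for name, values in results.items():
    r = np.corrcoef(x, values)[0, 1]
    print(f"{name:>20}: mean={values.mean():.3f} sd={values.std(ddof=1):.3f} r={r:.3f}")
```

In a sketch like this, mean substitution always shrinks the standard deviation relative to the observed cases, because every imputed value sits exactly at the observed mean. Adding an error term to the regression predictions restores some of the variability that deterministic regression removes.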
Pages: 815-829
Page count: 15
Related papers
50 in total
  • [1] Some Classes of Logarithmic-Type Imputation Techniques for Handling Missing Data
    Pandey, Awadhesh K.
    Singh, G. N.
    Bhattacharyya, D.
    Ali, Abdulrazzaq Q.
    Al-Thubaiti, Samah
    Yakout, H. A.
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021
  • [2] A comparison of imputation techniques for handling missing predictor values in a risk model with a binary outcome
    Ambler, Gareth
    Omar, Rumana Z.
    Royston, Patrick
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2007, 16 (03) : 277 - 298
  • [3] A Comparison of Missing-Data Imputation Techniques in Exploratory Factor Analysis
    Xiao, Canhua
    Bruner, Deborah W.
    Dai, Tian
    Guo, Ying
    Hanlon, Alexandra
    [J]. JOURNAL OF NURSING MEASUREMENT, 2019, 27 (02) : 313 - 334
  • [4] Handling missing data in nursing research with multiple imputation
    Kneipp, SM
    McIntosh, M
    [J]. NURSING RESEARCH, 2001, 50 (06) : 384 - 389
  • [5] Imputation is beneficial for handling missing data in predictive models
    Steyerberg, Ewout W.
    van Veen, Mirjam
    [J]. JOURNAL OF CLINICAL EPIDEMIOLOGY, 2007, 60 (09) : 979 - 979
  • [6] Multiple Imputation A Flexible Tool for Handling Missing Data
    Li, Peng
    Stuart, Elizabeth A.
    Allison, David B.
    [J]. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2015, 314 (18) : 1966 - 1967
  • [7] A comparison of imputation methods for handling missing scores in biometric fusion
    Ding, Yaohui
    Ross, Arun
    [J]. PATTERN RECOGNITION, 2012, 45 (03) : 919 - 933
  • [8] Handling Missing Values in Longitudinal Panel Data With Multiple Imputation
    Young, Rebekah
    Johnson, David R.
    [J]. JOURNAL OF MARRIAGE AND FAMILY, 2015, 77 (01) : 277 - 294
  • [9] Imputation Methods for Handling Missing Dietary Supplement Dosage Data
    Leung, June
    Dwyer, Johanna
    Hibberd, Patricia
    Jacques, Paul
    Rand, William
    [J]. JOURNAL OF RENAL NUTRITION, 2010, 20 (05) : 342 - 347
  • [10] Handling missing data in trees: Surrogate splits or statistical imputation?
    Feelders, A
    [J]. PRINCIPLES OF DATA MINING AND KNOWLEDGE DISCOVERY, 1999, 1704 : 329 - 334