A comparison of imputation techniques for handling missing data

Cited by: 174
Authors
Musil, CM [1]
Warner, CB
Yobas, PK
Jones, SL
Affiliations
[1] Case Western Reserve Univ, Frances Payne Bolton Sch Nursing, Dept Sociol, Cleveland, OH 44106 USA
[2] Mahidol Univ, Fac Nursing, Dept Mental Hlth & Psychiat Nursing, Bangkok 10700, Thailand
[3] Kent State Univ, Coll Nursing, Kent, OH 44242 USA
Funding
National Institutes of Health (USA)
DOI
10.1177/019394502762477004
Chinese Library Classification number
R47 [Nursing]
Discipline classification code
1011
Abstract
Researchers are commonly faced with the problem of missing data. This article presents theoretical and empirical information for the selection and application of approaches for handling missing data on a single variable. An actual data set of 492 cases with no missing values was used to create a simulated yet realistic data set with missing at random (MAR) data. The authors compare and contrast five approaches (listwise deletion, mean substitution, simple regression, regression with an error term, and the expectation maximization [EM] algorithm) for dealing with missing data, and compare the effects of each method on descriptive statistics and correlation coefficients for the imputed data (n = 96) and the entire sample (n = 492) when imputed data are included. All methods had limitations, although our findings suggest that mean substitution was the least effective and that regression with an error term and the EM algorithm produced estimates closest to those of the original variables.
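To make the compared approaches concrete, the following is a minimal sketch of four of the five methods named in the abstract (listwise deletion, mean substitution, simple regression, and regression with an error term); the EM algorithm is omitted for brevity. It is not the authors' code or data: the variables `x` and `y`, the synthetic data-generating model, the missingness mechanism, and all function names are assumptions chosen only to mirror the abstract's setup of 492 cases with 96 values missing on a single variable.

```python
import numpy as np

def listwise_deletion(x, y):
    """Drop every case whose y value is missing."""
    keep = ~np.isnan(y)
    return x[keep], y[keep]

def mean_substitution(y):
    """Replace missing y values with the mean of the observed y values."""
    filled = y.copy()
    filled[np.isnan(y)] = np.nanmean(y)
    return filled

def regression_imputation(x, y, add_error=False, rng=None):
    """Impute missing y from a simple linear regression of y on x.

    With add_error=True, a random normal draw scaled to the residual
    standard deviation is added to each prediction (regression with
    an error term).
    """
    missing = np.isnan(y)
    slope, intercept = np.polyfit(x[~missing], y[~missing], deg=1)
    predicted = intercept + slope * x[missing]
    if add_error:
        if rng is None:
            rng = np.random.default_rng()
        residuals = y[~missing] - (intercept + slope * x[~missing])
        noise = rng.normal(0.0, residuals.std(ddof=2), int(missing.sum()))
        predicted = predicted + noise
    filled = y.copy()
    filled[missing] = predicted
    return filled

# Synthetic stand-in data (hypothetical, not the study's data set).
rng = np.random.default_rng(0)
n_cases, n_missing = 492, 96
x = rng.normal(50.0, 10.0, n_cases)
y_complete = 5.0 + 0.6 * x + rng.normal(0.0, 6.0, n_cases)

# MAR-style mechanism: the chance that y is missing depends only on observed x.
weights = np.exp((x - x.mean()) / x.std())
drop = rng.choice(n_cases, size=n_missing, replace=False, p=weights / weights.sum())
y = y_complete.copy()
y[drop] = np.nan

x_lw, y_lw = listwise_deletion(x, y)
y_mean = mean_substitution(y)
y_reg = regression_imputation(x, y)
y_reg_err = regression_imputation(x, y, add_error=True, rng=rng)

for name, values in [("complete", y_complete), ("listwise", y_lw),
                     ("mean", y_mean), ("regression", y_reg),
                     ("regression+error", y_reg_err)]:
    print(f"{name:>17}: mean={values.mean():.2f} sd={values.std(ddof=1):.2f}")
```

Comparing the printed standard deviations gives a rough sense of why methods that ignore residual variation, such as mean substitution, tend to shrink the spread of the imputed variable, whereas adding an error term restores some of it. How closely any run resembles the article's findings depends entirely on the assumed synthetic model.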
Pages: 815-829
Page count: 15