Development of Imputation Methods for Missing Data in Multiple Linear Regression Analysis

Cited by: 2
Authors
Thongsri, Thidarat [1]
Samart, Klairung [1]
Affiliations
[1] Prince Songkla Univ, Fac Sci, Div Computat Sci, Stat & Applicat Res Unit, Hat Yai, Thailand
Keywords
missing data; imputation method; composite method; multiple linear regression; hot deck imputation
DOI
10.1134/S1995080222140323
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Missing data is a common issue in many domains of study, and disregarding it may lead to erroneous conclusions. The objective of this study is to develop and compare the efficiency of eight imputation methods: hot deck imputation (HD), k-nearest neighbors imputation (KNN), stochastic regression imputation (SR), predictive mean matching imputation (PMM), random forest imputation (RF), stochastic regression random forest with equivalent weight imputation (SREW), k-nearest random forest with equivalent weight imputation (KREW), and k-nearest stochastic regression and random forest with equivalent weight imputation (KSREW). The simulation was run using sample sizes of 30, 60, 100, and 150 and missing percentages of 10%, 20%, 30%, and 40%. The average mean square error (AMSE) was used to compare efficiency. The results reveal that the proposed composite approaches outperformed the single ones, particularly the three-component method KSREW. Increasing the number of components to four, on the other hand, had no effect on imputation performance.
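The idea of an equal-weight composite and of an AMSE comparison can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which variable is missing, how the data were generated, or exactly how AMSE is defined, so the following are assumptions: values are missing completely at random in the response, the data come from a hypothetical three-covariate linear model, and AMSE is taken as the mean over replications of the squared error between imputed and true values. To stay dependency-free, the sketch combines only KNN and stochastic regression; the random forest component used in SREW, KREW, and KSREW is omitted, so the composite here is a hypothetical two-component stand-in.

```python
import numpy as np

def knn_impute(X, y, miss, k=5):
    """Fill each missing y with the mean y of its k nearest observed cases in X."""
    obs = ~miss
    out = y.copy()
    for i in np.where(miss)[0]:
        dist = np.linalg.norm(X[obs] - X[i], axis=1)
        nearest = np.argsort(dist)[:k]
        out[i] = y[obs][nearest].mean()
    return out

def stochastic_regression_impute(X, y, miss, rng):
    """Fill missing y with the OLS prediction plus a normal residual draw."""
    obs = ~miss
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A[obs], y[obs], rcond=None)
    resid = y[obs] - A[obs] @ beta
    sigma = np.sqrt(resid @ resid / (obs.sum() - A.shape[1]))
    out = y.copy()
    out[miss] = A[miss] @ beta + rng.normal(0.0, sigma, miss.sum())
    return out

def equal_weight_composite(imputations):
    """Average the component imputations with equal weights."""
    return np.mean(imputations, axis=0)

def amse(n=100, miss_rate=0.2, reps=200, seed=0):
    """Assumed AMSE: mean over replications of the MSE on the imputed cells."""
    rng = np.random.default_rng(seed)
    errors = {"KNN": [], "SR": [], "composite": []}
    for _ in range(reps):
        # Hypothetical data-generating model (not taken from the paper).
        X = rng.normal(size=(n, 3))
        y_true = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)
        miss = np.zeros(n, dtype=bool)
        miss[rng.choice(n, size=int(miss_rate * n), replace=False)] = True
        y = y_true.copy()
        y[miss] = np.nan
        knn = knn_impute(X, y, miss)
        sr = stochastic_regression_impute(X, y, miss, rng)
        comp = equal_weight_composite([knn, sr])
        for name, imp in (("KNN", knn), ("SR", sr), ("composite", comp)):
            errors[name].append(np.mean((imp[miss] - y_true[miss]) ** 2))
    return {name: float(np.mean(v)) for name, v in errors.items()}
```

Calling `amse(n=60, miss_rate=0.3)` returns one AMSE per method for a single design cell. Because squared error is convex, an equal-weight average can never have a larger mean squared error than the average of its components' errors, which is one reason composite imputers are a natural thing to try; whether a composite beats its best component depends on the data.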
Pages: 3390-3399
Number of pages: 10