Development of Imputation Methods for Missing Data in Multiple Linear Regression Analysis

Cited by: 2
Authors
Thongsri, Thidarat [1]
Samart, Klairung [1]
Affiliations
[1] Prince Songkla Univ, Fac Sci, Div Computat Sci, Stat & Applicat Res Unit, Hat Yai, Thailand
Keywords
missing data; imputation method; composite method; multiple linear regression; hot deck imputation
DOI
10.1134/S1995080222140323
Chinese Library Classification
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Missing data is a common issue in many domains of study. If this issue is disregarded, erroneous conclusions may be reached. The objective of this study is to develop and compare the efficiency of eight imputation methods: hot deck imputation (HD), k-nearest neighbors imputation (KNN), stochastic regression imputation (SR), predictive mean matching imputation (PMM), random forest imputation (RF), stochastic regression random forest with equivalent weight imputation (SREW), k-nearest random forest with equivalent weight imputation (KREW), and k-nearest stochastic regression and random forest with equivalent weight imputation (KSREW). The simulation was run using sample sizes of 30, 60, 100, and 150 and missing percentages of 10%, 20%, 30%, and 40%. The average mean square error (AMSE) was used to compare efficiency. The results reveal that the proposed composite approaches outperformed the single ones, particularly the three-component method KSREW. Increasing the number of components to four, on the other hand, has no effect on imputation performance.
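The abstract does not spell out how the component imputations are combined or how AMSE is computed, so the following is only a minimal sketch under stated assumptions: "equivalent weight" is read as a plain average of the component imputations, AMSE is read as the mean of per-replication MSE values, and the model is reduced to a single predictor. All function names are hypothetical and are not taken from the paper.

```python
import math
import random

def knn_impute(x_obs, y_obs, x_new, k=3):
    """Average y over the k observed cases whose x is closest to x_new."""
    nearest = sorted(zip(x_obs, y_obs), key=lambda p: abs(p[0] - x_new))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def regression_impute(x_obs, y_obs, x_new, rng=None):
    """Least-squares prediction at x_new for a one-predictor model.

    If rng is given, a normal draw scaled by the residual standard
    deviation is added (the 'stochastic' part of stochastic regression).
    """
    n = len(x_obs)
    mx = sum(x_obs) / n
    my = sum(y_obs) / n
    sxx = sum((x - mx) ** 2 for x in x_obs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_obs, y_obs))
    slope = sxy / sxx
    intercept = my - slope * mx
    pred = intercept + slope * x_new
    if rng is None:
        return pred
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(x_obs, y_obs))
    sd = math.sqrt(sse / (n - 2))
    return pred + rng.gauss(0.0, sd)

def equal_weight_composite(component_values):
    """Assumed combination rule: every component gets the same weight."""
    return sum(component_values) / len(component_values)

def mse(imputed, truth):
    """Mean squared error between imputed and true values."""
    return sum((a - b) ** 2 for a, b in zip(imputed, truth)) / len(truth)

def amse(mse_per_replication):
    """Assumed AMSE: average of the MSE over simulation replications."""
    return sum(mse_per_replication) / len(mse_per_replication)

# Toy data: y is missing for the case with x = 5.
x_obs = [1.0, 2.0, 3.0, 4.0]
y_obs = [2.0, 4.0, 6.0, 8.0]
reg_value = regression_impute(x_obs, y_obs, 5.0)   # deterministic, no noise
knn_value = knn_impute(x_obs, y_obs, 5.0, k=2)
composite = equal_weight_composite([reg_value, knn_value])
```

Passing `rng=random.Random(0)` to `regression_impute` switches on the stochastic draw. A three-component composite in the spirit of KSREW would pass a third value, for example from a random forest, into `equal_weight_composite`.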
Pages: 3390 - 3399
Page count: 10
Related papers
50 in total
  • [1] Development of Imputation Methods for Missing Data in Multiple Linear Regression Analysis
    Thongsri, Thidarat
    Samart, Klairung
    Lobachevskii Journal of Mathematics, 2022, 43 : 3390 - 3399
  • [2] Imputation Methods for Multiple Regression with Missing Heteroscedastic Data
    Asif, Muhammad
    Samart, Klairung
    THAILAND STATISTICIAN, 2022, 20 (01) : 1 - 15
  • [3] Regression multiple imputation for missing data analysis
    Yu, Lili
    Liu, Liang
    Peace, Karl E.
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2020, 29 (09) : 2647 - 2664
  • [4] Composite Imputation Method for the Multiple Linear Regression with Missing at Random Data
    Thongsri, Thidarat
    Samart, Klairung
    INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE, 2022, 17 (01) : 51 - 62
  • [5] Analysis of Missing Data in Progressed Learners: The Use of Multiple Imputation Methods
    Mabungane, S.
    Ramroop, S.
    Mwambi, H.
    AFRICAN JOURNAL OF RESEARCH IN MATHEMATICS SCIENCE AND TECHNOLOGY EDUCATION, 2023, 27 (02) : 112 - 122
  • [6] Multiple Imputation for Missing Data via Sequential Regression Trees
    Burgette, Lane F.
    Reiter, Jerome P.
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2010, 172 (09) : 1070 - 1076
  • [7] DATA ENVELOPMENT ANALYSIS WITH MISSING DATA: A MULTIPLE LINEAR REGRESSION ANALYSIS APPROACH
    Chen, Ya
    Li, Yongjun
    Wu, Huaqing
    Liang, Liang
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING, 2014, 13 (01) : 137 - 153
  • [8] The use of multiple imputation for the analysis of missing data
    Sinharay, S
    Stern, HS
    Russell, D
    PSYCHOLOGICAL METHODS, 2001, 6 (04) : 317 - 329
  • [9] Multiple imputation of missing data for survey data analysis
    Lupo, Coralie
    Le Bouquin, Sophie
    Michel, Virginie
    Colin, Pierre
    Chauvin, Claire
    EPIDEMIOLOGIE ET SANTE ANIMALE, 2008, (53) : 73 - 83
  • [10] Missing Data and Imputation Methods
    Schober, Patrick
    Vetter, Thomas R.
    ANESTHESIA AND ANALGESIA, 2020, 131 (05) : 1419 - 1420