Performance Evaluation of Predictive Models for Missing Data Imputation in Weather Data

Cited by: 0
Authors
Doreswamy [1]
Gad, Ibrahim [1, 2]
Manjunatha, B. R. [3]
Affiliations
[1] Mangalore Univ, Dept Comp Sci, Mangalore, Karnataka, India
[2] Tanta Univ, Fac Sci, Tanta, Egypt
[3] Mangalore Univ, Dept Marine Geol, Mangalore, Karnataka, India
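Keywords
Missing data; Imputation; NCDC data set; Weather analysis; SVM; KNN; VALUES
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract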
Real datasets can have missing values for various reasons, such as records that were not kept on file and data corruption. Climate forecasting is highly relevant to agriculture and industry, and predictions of climate conditions are needed in many areas of life. Handling missing data is important because the performance of many machine learning algorithms is affected by missing values, and many of them do not support data with missing values at all. Various techniques have been used to address the missing data problem; the most widely applied is to remove any row that contains at least one missing value. Another approach is to impute the missing data to yield a more complete dataset. To improve prediction accuracy with climate data, missing values should be removed or imputed (predicted) in the pre-processing phase, before the data are used for prediction or clustering in the analysis step. In this paper, we propose a new technique for handling missing values in weather data using machine learning algorithms. We run experiments on the NCDC dataset to evaluate the prediction error of five methods: kernel ridge, linear regression, random forest, SVM imputation, and KNN imputation. The missing values were imputed using each method and compared with the observed values. The results of the proposed method were compared with existing techniques.
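The evaluation idea in the abstract (hide values that are actually known, impute them, then compare the imputed values with the observed ones) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the data are synthetic rather than the NCDC dataset, only a simple KNN imputer is shown out of the five methods, the mean-imputation baseline is added here purely for comparison, and the names `knn_impute`, `mean_impute`, and `rmse` are hypothetical.

```python
import math
import random


def knn_impute(rows, target, k=3):
    """Fill None entries in column `target` with the mean of that column
    over the k complete rows that are nearest in the remaining columns."""
    features = [c for c in range(len(rows[0])) if c != target]
    complete = [r for r in rows if r[target] is not None]
    filled = []
    for r in rows:
        if r[target] is not None:
            filled.append(list(r))
            continue
        # Rank complete rows by Euclidean distance on the feature columns.
        nearest = sorted(
            complete,
            key=lambda c: math.dist(
                [r[f] for f in features], [c[f] for f in features]
            ),
        )[:k]
        value = sum(n[target] for n in nearest) / len(nearest)
        filled.append([value if i == target else v for i, v in enumerate(r)])
    return filled


def mean_impute(rows, target):
    """Baseline (an assumption of this sketch): fill with the column mean."""
    observed = [r[target] for r in rows if r[target] is not None]
    mean = sum(observed) / len(observed)
    return [
        [mean if (i == target and v is None) else v for i, v in enumerate(r)]
        for r in rows
    ]


def rmse(truth, imputed, target, masked):
    """Root mean squared error over the rows whose target was hidden."""
    errors = [(truth[i][target] - imputed[i][target]) ** 2 for i in masked]
    return math.sqrt(sum(errors) / len(errors))


# Synthetic stand-in for weather records: [temperature, dew point].
rng = random.Random(0)
truth = []
for _ in range(200):
    temp = rng.uniform(10.0, 35.0)
    dew = 0.8 * temp - 2.0 + rng.gauss(0.0, 0.5)
    truth.append([temp, dew])

# Hide 40 known dew-point values so the imputations can be scored.
TARGET = 1
masked = rng.sample(range(len(truth)), 40)
data = [list(r) for r in truth]
for i in masked:
    data[i][TARGET] = None

rmse_knn = rmse(truth, knn_impute(data, TARGET, k=5), TARGET, masked)
rmse_mean = rmse(truth, mean_impute(data, TARGET), TARGET, masked)
print(f"KNN RMSE:  {rmse_knn:.3f}")
print(f"Mean RMSE: {rmse_mean:.3f}")
```

Because the held-out values are known, the same masking and scoring loop could in principle be reused with any other imputer in place of `knn_impute`.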
Pages: 1327-1334
Number of pages: 8