Comparison of missing data imputation methods using weather data

Cited by: 1
Authors
Nida, Hafiza [1]
Kashif, Muhammad [1]
Khan, Muhammad Imran [1]
Ghamkhar, Madiha [1]
Affiliation
[1] Univ Agr Faisalabad, Fac Sci, Dept Math & Stat, Faisalabad, Pakistan
Source
Keywords
Rainfall; temperature; missing data; imputation methods; root mean square error; TEMPERATURE; PAKISTAN; CLIMATE; CROP;
DOI
10.21162/PAKJAS/23.228
Chinese Library Classification
S [Agricultural Sciences];
Discipline classification code
09;
Abstract
Researchers and data analysts commonly face challenges with missing data when analyzing large data sets in their fields of study. Missing data must be handled properly to obtain reliable results. The objective of this research is to evaluate different imputation techniques for handling missing observations in weather data. For this purpose, data on three variables, daily rainfall, maximum temperature (Tmax) and minimum temperature (Tmin), were obtained from the Pakistan Meteorological Department for 23 stations in Pakistan for the years 1981 to 2020. Each variable has about 14610 observations in total, while the number of missing observations, referred to as the size of missingness, differs by variable and station. Four techniques were considered for estimating the missing observations at each station: mean imputation, k-nearest neighbors (KNN) imputation, predictive mean matching (PMM) imputation and sample imputation. Because the size of missingness varied from station to station, the imputation technique was chosen station by station as the one with the minimum root mean square error (RMSE). KNN was the most appropriate technique for estimating missing rainfall observations at all stations, while mean imputation is recommended over the other methods for the Tmax and Tmin data.
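The RMSE-based comparison described in the abstract can be illustrated with a minimal sketch. It assumes an evaluation scheme in which known values are hidden, imputed, and compared with the truth; the abstract does not say how the authors computed RMSE, so this is not their procedure. The function names, the toy Tmax values, and the KNN variant (neighbors chosen by distance in time rather than in feature space) are all hypothetical, and PMM is left out for brevity.

```python
import math
import random

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mean_impute(series):
    """Replace each None with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in series]

def sample_impute(series, rng):
    """Replace each None with a random draw from the observed values."""
    observed = [v for v in series if v is not None]
    return [rng.choice(observed) if v is None else v for v in series]

def knn_impute(series, k=2):
    """Replace each None with the mean of the k observed values nearest in time.

    Assumption: distance is the gap between positions in the series.
    """
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    filled = []
    for i, v in enumerate(series):
        if v is None:
            nearest = sorted(observed, key=lambda iv: abs(iv[0] - i))[:k]
            v = sum(val for _, val in nearest) / len(nearest)
        filled.append(v)
    return filled

def evaluate(complete, hidden_idx, rng):
    """Hide the values at hidden_idx, impute them, and score each method by RMSE."""
    idx = sorted(hidden_idx)
    masked = [None if i in hidden_idx else v for i, v in enumerate(complete)]
    truth = [complete[i] for i in idx]
    imputed = {
        "mean": mean_impute(masked),
        "sample": sample_impute(masked, rng),
        "knn": knn_impute(masked),
    }
    return {name: rmse(truth, [filled[i] for i in idx])
            for name, filled in imputed.items()}

# Made-up daily Tmax values for one hypothetical station (not from the paper).
tmax = [30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0]
scores = evaluate(tmax, {2, 5}, random.Random(0))
best = min(scores, key=scores.get)  # method with the smallest RMSE
print(scores, best)
```

On this toy series, which rises linearly, the time-based KNN variant recovers the hidden values exactly and so has the smallest RMSE. That outcome reflects only the made-up data and says nothing about the paper's station-wise results.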
Pages: 327-336
Number of pages: 10