A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria

Cited by: 37
Authors
Aieb, Amir [1, 2]
Madani, Khodir [1 ]
Scarpa, Marco [3 ]
Bonaccorso, Brunella [3 ]
Lefsih, Khalef [1 ]
Affiliations
[1] Univ Bejaia, L3BS, Bejaia 06000, Algeria
[2] Abderrahmane Mira Univ, Fac Exact Sci, Dept Comp Sci, Bejaia 06000, Algeria
[3] Univ Messina, Dept Engn, Messina, Italy
Keywords
Atmospheric science; Environmental science; Hydrology; TIME-SERIES; IMPUTATION; SATELLITE; GAPS; PCA
DOI
10.1016/j.heliyon.2019.e01247
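Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract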
Missing data are a very frequent problem in climatology; they affect the quality of the results obtained in hydrological studies and in water resources management. This paper proposes a new imputation algorithm based on the optimization of several regression methods: hot deck, k-nearest-neighbors imputation, weighted k-nearest-neighbors imputation, multiple imputation, linear regression and the simple average method. The choice of these methods was justified by qualitative and quantitative statistical tests. However, the reliability of the results depends mainly on the percentage of missing data, the choice of neighboring stations and the missingness mechanism, which should be missing at random. The study found that most stations in the Soummam watershed are poorly correlated, either because of the large loss of rainfall data or because of the geology of the watershed, which links station position to rainfall variability. For this reason, principal component analysis was applied to a set of stations; it showed a positive impact of altitude, latitude and longitude on the correlation index between the selected stations. A graphical analysis of the normal distribution of the RMSE values, obtained by applying the proposed technique to several random missingness cases (4%, 8%, 12% and 16%), confirmed the validity and performance of the approach.
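The abstract does not give the algorithm itself, so the following is only a minimal sketch of the kind of evaluation it describes: gaps are removed at random at rates of 4%, 8%, 12% and 16%, filled from neighboring stations, and scored by RMSE. The fill step here is a correlation-weighted average over the k best-correlated stations, a simplified stand-in for the weighted k-nearest-neighbors method and not the authors' optimized procedure. The function names, the synthetic data and the assumption that neighbor stations are complete are all hypothetical.

```python
import numpy as np

def weighted_knn_impute(target, neighbors, k=3):
    """Fill NaNs in `target` from the k best-correlated neighbor stations.

    target:    1-D array of daily rainfall, NaN where a day is missing.
    neighbors: 2-D array (days x stations), assumed complete in this sketch.
    """
    observed = ~np.isnan(target)
    # Correlate each neighbor with the target over the days the target was observed.
    corr = np.array([
        np.corrcoef(target[observed], neighbors[observed, j])[0, 1]
        for j in range(neighbors.shape[1])
    ])
    best = np.argsort(-corr)[:k]              # indices of the k highest correlations
    weights = np.clip(corr[best], 0.0, None)  # ignore negatively correlated stations
    if weights.sum() == 0:
        weights = np.ones_like(weights)       # fall back to a simple average
    weights = weights / weights.sum()
    filled = target.copy()
    filled[~observed] = neighbors[~observed][:, best] @ weights
    return filled

def rmse_under_random_gaps(target, neighbors, rate, rng, k=3):
    """Hide a random fraction `rate` of a complete series, impute it, return the RMSE."""
    n_missing = max(1, int(round(rate * target.size)))
    idx = rng.choice(target.size, size=n_missing, replace=False)
    masked = target.copy()
    masked[idx] = np.nan
    filled = weighted_knn_impute(masked, neighbors, k=k)
    return float(np.sqrt(np.mean((filled[idx] - target[idx]) ** 2)))

# Synthetic stand-in for one year of daily rainfall at 1 target and 4 neighbor stations.
rng = np.random.default_rng(0)
base = rng.gamma(0.5, 4.0, size=365)
neighbors = np.clip(base[:, None] + rng.normal(0.0, 0.5, size=(365, 4)), 0.0, None)
target = np.clip(base + rng.normal(0.0, 0.5, size=365), 0.0, None)

for rate in (0.04, 0.08, 0.12, 0.16):
    print(f"{rate:.0%} missing: RMSE = {rmse_under_random_gaps(target, neighbors, rate, rng):.3f}")
```

Repeating the loop over many random masks would give a sample of RMSE values per missingness rate, which is the quantity whose distribution the abstract says was examined graphically.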
Pages: 27
Related papers
3 in total
  • [1] A methodology for treating missing data applied to daily rainfall data in the Candelaro River Basin (Italy)
    Rossella Lo Presti
    Emanuele Barca
    Giuseppe Passarella
    Environmental Monitoring and Assessment, 2010, 160 : 1 - 22
  • [2] A methodology for treating missing data applied to daily rainfall data in the Candelaro River Basin (Italy)
    Lo Presti, Rossella
    Barca, Emanuele
    Passarella, Giuseppe
    ENVIRONMENTAL MONITORING AND ASSESSMENT, 2010, 160 (1-4) : 1 - 22
  • [3] Analysis of a new spatial interpolation weighting method to estimate missing data applied to rainfall records
    Morales Martinez, Jorge Luis
    Antonio Horta-Rangel, Francisco
    Segovia-Dominguez, Ignacio
    Robles Morua, Agustin
    Horacio Hernandez, J.
    ATMOSFERA, 2019, 32 (03) : 237 - 259