Assessing temporal data partitioning scenarios for estimating reference evapotranspiration with machine learning techniques in arid regions

Cited by: 27
Authors
Kazemi, Mohammad Hossein [1]
Shiri, Jalal [1, 2]
Marti, Pau [3]
Majnooni-Heris, Abolfazl [1]
Affiliations
[1] Univ Tabriz, Fac Agr, Water Engn Dept, Tabriz, Iran
[2] Univ Tabriz, Fac Civil Engn, Ctr Excellence Hydroinformat, Tabriz, Iran
[3] Univ Illes Balears, Area Engn Agroforestal, Carretera Valldemossa Km 7-5, Palma De Mallorca 07022, Spain
Keywords
Evapotranspiration; Gene expression programming; Hold out; K-fold validation; Modeling reference evapotranspiration; Neural networks; Time series; Temperature; Algorithms; Strategies; Equations; Selection
DOI
10.1016/j.jhydrol.2020.125252
Chinese Library Classification
TU [Building Science]
Discipline classification code
0813
Abstract
Recently, data-driven machine learning techniques have been widely applied to model reference evapotranspiration (ETo) under various climatic conditions, with differing numbers of sites and lengths of available data records. A major issue in applying these models is the proper selection of training and testing data sets. Although some spatial generalization approaches have been recommended for this purpose, no local (temporal) data partitioning strategies have been specifically recommended for machine learning based ETo estimation. The present study evaluates different hold-out and k-fold validation temporal data partitioning strategies when using the gene expression programming (GEP) technique to estimate daily ETo in arid regions. The k-fold validation strategies used annual, monthly and growing-season patterns as test data sets. Although the commonly used partitioning of the available patterns into training and testing sets gave accurate results, statistical analysis showed that the results obtained through k-fold validation were more reliable. A two-block partitioning strategy with chronological data selection for training and testing provided the most accurate results among the hold-out procedures (mean scatter index (SI) of 0.162). Fixing the extreme ETo values as the training data set in hold-out procedures provided the least accurate results, with considerable over- and underestimation of ETo values (mean SI of 0.506). Results based on hold-out approaches can be biased or only partially valid, depending on which part of the time series is selected as test data. K-fold validation yielded the lowest over- and underestimation of ETo values. Further, among the k-fold validation strategies, using monthly patterns as the minimum affordable test size produced higher error magnitudes, while using the complete patterns of one growing season provided more accurate results.
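The partitioning strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the record layout (a `date` field per daily pattern), the function names, the 75% training fraction, and the SI definition (RMSE divided by the mean of the observed values, a common convention that the abstract does not spell out) are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from datetime import date


def scatter_index(observed, predicted):
    """Assumed definition: RMSE divided by the mean of the observed values."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


def two_block_holdout(records, train_fraction=0.75):
    """Chronological two-block split: earliest records train, latest test."""
    ordered = sorted(records, key=lambda r: r["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def k_fold_by_period(records, period_of):
    """Yield (period, train, test), holding out one whole period per fold.

    period_of maps a record to its fold label, e.g. the calendar year
    for annual folds or the month number for monthly folds.
    """
    groups = defaultdict(list)
    for r in records:
        groups[period_of(r)].append(r)
    for period in sorted(groups):
        test = groups[period]
        train = [r for p, rs in groups.items() if p != period for r in rs]
        yield period, train, test


# Hypothetical toy series: one pattern per month over four years.
records = [
    {"date": date(y, m, 15), "eto": float(m)}
    for y in range(2001, 2005)
    for m in range(1, 13)
]
train, test = two_block_holdout(records)
annual_folds = list(k_fold_by_period(records, lambda r: r["date"].year))
```

With annual folds every year serves once as the test set, so the error estimate does not hinge on a single choice of test block. A growing-season variant would only need a different `period_of` function that labels each record with its season.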
Pages: 14