Which one is more important in daily runoff forecasting using data driven models: Input data, model type, preprocessing or data length?

Cited by: 29
Authors
Moosavi, Vahid [1]
Fard, Zeinab Gheisoori [1]
Vafakhah, Mehdi [1]
Affiliations
[1] Tarbiat Modares Univ, Fac Nat Resources & Marine Sci, Dept Watershed Management Engn, Tehran, Iran
Keywords
Artificial intelligence; Data driven model; Optimization; Signal processing; Taguchi method; SUPPORT VECTOR MACHINE; NEURAL-NETWORK MODELS; PREDICT SCOUR DEPTH; ABUTMENT SCOUR; PART; DECOMPOSITION; WATER; OPTIMIZATION; SENSITIVITY; SELECTION;
DOI
10.1016/j.jhydrol.2022.127429
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
Rainfall-runoff modeling is of great importance in the hydrological sciences. Runoff models fall into three main categories: physically based, conceptual, and empirical. Alongside process-based models, data-driven models are among the most widely used in runoff modeling. Many studies have assessed the performance of various models and the effect of input datasets, data length, and different signal processing methods on modeling performance. However, each of these studies examined one of these factors in isolation and did not assess their combined effect on the accuracy of runoff forecasting. Therefore, assessing the importance of each factor and determining the optimum structure that yields the best accuracy remain challenging. The main aim of this study was to determine the importance and the optimal combination of these factors in daily runoff modeling. To achieve this goal, the Taguchi method was used. First, five levels were defined for each of the above-mentioned factors: five input data combinations; five data-driven models, i.e., Adaptive Neuro-Fuzzy Inference System (ANFIS), Support Vector Regression (SVR), Group Method of Data Handling (GMDH), Random Forest (RF), and Partial Least Squares Regression (PLS); four signal processing methods, i.e., normalization, wavelet, ensemble empirical mode decomposition (EEMD), and singular spectrum analysis (SSA), plus a no-preprocessing condition; and five data lengths, i.e., 2, 5, 10, 15, and 20 years. The L-25 Taguchi orthogonal array was selected accordingly. The required 25 tests were implemented according to this array in three different basins to obtain more generalizable results. The results were then used in a Taguchi analysis to obtain the optimal combination of factor levels and the importance of each factor for accurate runoff prediction. The results showed that the hybrid wavelet-GMDH model, with the complete dataset as input and a 20-year data length, provides the highest accuracy. In decreasing order of importance for runoff prediction accuracy, the factors were: input dataset, data length, preprocessing, and model type. GMDH and SVR performed best, and the wavelet and EEMD signal processing methods had the greatest effect on the performance of the data-driven models.
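The Taguchi procedure summarized in the abstract (four factors at five levels each, 25 runs, then a ranking of factors by their effect) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the factor names, the `l25_array` construction (a standard modular-arithmetic orthogonal array whose row order may differ from printed L-25 tables), the larger-is-better signal-to-noise formula, and above all the synthetic scores and weights, which are invented and are not results from the study.

```python
import math
from itertools import combinations, product

# Hypothetical factor names mirroring the four factors named in the abstract.
FACTORS = ["input_data", "model_type", "preprocessing", "data_length"]
N_LEVELS = 5


def l25_array(n_factors=4):
    """Return a 25-run, 5-level orthogonal array as a list of tuples.

    Columns are i, j, (i + j) % 5, (i + 2j) % 5, ... Because 5 is prime,
    every pair of columns contains each of the 25 level pairs exactly once.
    """
    if not 1 <= n_factors <= 6:
        raise ValueError("this construction supports 1 to 6 factors")
    rows = []
    for i, j in product(range(N_LEVELS), repeat=2):
        row = [i, j] + [(i + k * j) % N_LEVELS for k in range(1, 5)]
        rows.append(tuple(row[:n_factors]))
    return rows


def sn_larger_is_better(values):
    """Taguchi signal-to-noise ratio (dB) for a larger-is-better response."""
    return -10.0 * math.log10(sum(1.0 / v**2 for v in values) / len(values))


def main_effects(design, responses):
    """Mean response at each level of each factor."""
    effects = {}
    for col, name in enumerate(FACTORS):
        effects[name] = [
            sum(r for row, r in zip(design, responses) if row[col] == level)
            / sum(1 for row in design if row[col] == level)
            for level in range(N_LEVELS)
        ]
    return effects


def rank_factors(effects):
    """Rank factors by delta = (max level mean) - (min level mean)."""
    deltas = {name: max(m) - min(m) for name, m in effects.items()}
    return sorted(deltas, key=deltas.get, reverse=True), deltas


# Arbitrary synthetic weights, NOT values from the study: they only give the
# analysis something to recover.
SYNTHETIC_WEIGHTS = {
    "input_data": 0.20,
    "model_type": 0.05,
    "preprocessing": 0.10,
    "data_length": 0.15,
}


def synthetic_score(row):
    """Made-up additive accuracy score for one run of the design."""
    return 0.5 + sum(
        SYNTHETIC_WEIGHTS[name] * level / (N_LEVELS - 1)
        for name, level in zip(FACTORS, row)
    )


design = l25_array()
responses = [synthetic_score(row) for row in design]
effects = main_effects(design, responses)
order, deltas = rank_factors(effects)
# Best level per factor: the level index with the highest mean response.
optimal = {name: means.index(max(means)) for name, means in effects.items()}

print("importance order:", order)
print("optimal levels:", optimal)
```

With real data, `responses` would hold one accuracy measure per run (or one signal-to-noise ratio per run computed with `sn_larger_is_better` over replicates such as several basins), and the delta ranking would then indicate which factor moves accuracy the most.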
Pages: 13