Which one is more important in daily runoff forecasting using data driven models: Input data, model type, preprocessing or data length?

Cited by: 29
Authors
Moosavi, Vahid [1]
Fard, Zeinab Gheisoori [1]
Vafakhah, Mehdi [1]
Affiliation
[1] Tarbiat Modares Univ, Fac Nat Resources & Marine Sci, Dept Watershed Management Engn, Tehran, Iran
Keywords
Artificial intelligence; Data driven model; Optimization; Signal processing; Taguchi method; SUPPORT VECTOR MACHINE; NEURAL-NETWORK MODELS; PREDICT SCOUR DEPTH; ABUTMENT SCOUR; PART; DECOMPOSITION; WATER; OPTIMIZATION; SENSITIVITY; SELECTION;
DOI
10.1016/j.jhydrol.2022.127429
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813 ;
Abstract
Rainfall-runoff modeling is of great importance in hydrological sciences. Runoff models fall into three main categories: physically based, conceptual and empirical. Besides process-based models, data-driven models are among the most widely used in runoff modeling. Various studies have assessed the performance of different models and the effects of input datasets, data length and signal processing methods on modeling performance. However, each of these studies examined one of these factors separately and did not assess their combined effect on the accuracy of runoff forecasting. Therefore, assessing the importance of each factor and determining the optimum structure that produces the best accuracy remain challenging. The main aim of this study was to determine the importance and the optimal combination of these factors in daily runoff modeling. To achieve this goal, the Taguchi method was used. First, five levels were defined for each of the four factors: five input data combinations; five data-driven models, i.e., Adaptive Neuro-Fuzzy Inference System (ANFIS), Support Vector Regression (SVR), Group Method of Data Handling (GMDH), Random Forest (RF) and Partial Least Squares Regression (PLS); five preprocessing conditions, i.e., four signal processing methods (normalization, wavelet, ensemble empirical mode decomposition (EEMD) and singular spectrum analysis (SSA)) plus a no-preprocessing condition; and five data lengths, i.e., 2, 5, 10, 15 and 20 years. The L-25 Taguchi orthogonal array was selected accordingly, and the required 25 tests were run in three different basins to obtain more generalizable results.
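As an illustration of the experimental design described above, the following is a minimal sketch of how a 25-run, five-level orthogonal array can be generated and mapped onto the four factors. It uses a standard modular-arithmetic construction over the integers mod 5; the abstract does not say which column assignment or run order the authors used, so the function names (`l25_array`, `is_orthogonal`), the column order, and the input-combination labels are assumptions made purely for illustration.

```python
from itertools import combinations, product


def l25_array(n_factors=4):
    """Build a 25-run, 5-level orthogonal array (strength 2) over Z_5.

    Standard construction: rows are indexed by (a, b) in Z_5 x Z_5 and the
    columns are a, b, a+b, a+2b, a+3b, a+4b (mod 5). Up to 6 columns exist.
    """
    if not 1 <= n_factors <= 6:
        raise ValueError("an L25 array supports at most 6 five-level factors")
    rows = []
    for a, b in product(range(5), repeat=2):
        full = [a, b] + [(a + k * b) % 5 for k in range(1, 5)]
        rows.append(tuple(full[:n_factors]))
    return rows


def is_orthogonal(rows):
    """Check that every pair of columns contains each level pair exactly once."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        # 25 rows and 25 distinct (level, level) pairs means one of each.
        if len({(r[i], r[j]) for r in rows}) != 25:
            return False
    return True


# Factor levels taken from the abstract, except the input combinations,
# whose contents are not listed there (labels below are hypothetical).
LEVELS = {
    "input": ["combination 1", "combination 2", "combination 3",
              "combination 4", "combination 5"],
    "model": ["ANFIS", "SVR", "GMDH", "RF", "PLS"],
    "preprocessing": ["none", "normalization", "wavelet", "EEMD", "SSA"],
    "data_length_years": [2, 5, 10, 15, 20],
}

array = l25_array(n_factors=len(LEVELS))
design = [
    {name: LEVELS[name][level] for name, level in zip(LEVELS, row)}
    for row in array
]

if __name__ == "__main__":
    print("orthogonal:", is_orthogonal(array))
    for run_number, run in enumerate(design, start=1):
        print(run_number, run)
```

A full factorial design over four five-level factors would need 5^4 = 625 runs; the orthogonal array keeps every pair of factor levels balanced while needing only 25.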
The results were then used in a Taguchi analysis to obtain the optimal combination of factor levels and the importance of each factor for accurate runoff prediction. The results showed that the hybrid wavelet-GMDH model with the complete dataset as input and a 20-year data length provided the highest accuracy. In order of importance and effect on runoff prediction accuracy, the factors ranked as follows: input dataset, data length, preprocessing and model type. GMDH and SVR performed best among the models, and the wavelet and EEMD signal processing methods had the greatest effect on the performance of the data-driven models.
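The factor ranking and optimal-level selection described above can be sketched as a main-effects analysis: average the response over the runs at each level of a factor, take the range (delta) of those averages, and rank the factors by delta. The sketch below is a simplified illustration, not the authors' procedure. It averages a raw score rather than a signal-to-noise ratio, which Taguchi analyses often use, and the scores are synthetic values generated from invented weights (`WEIGHTS`), so the ranking it prints is a property of those made-up weights and not a reproduction of the paper's results.

```python
from itertools import product

FACTORS = ["input", "model", "preprocessing", "data_length"]


def l25_array():
    """25-run, 5-level orthogonal array for four factors (levels 0..4)."""
    return [(a, b, (a + b) % 5, (a + 2 * b) % 5)
            for a, b in product(range(5), repeat=2)]


def main_effects(design, scores):
    """Mean score at each level of each factor (higher score = better)."""
    effects = {}
    for col, name in enumerate(FACTORS):
        means = []
        for level in range(5):
            vals = [s for row, s in zip(design, scores) if row[col] == level]
            means.append(sum(vals) / len(vals))
        effects[name] = means
    return effects


def rank_factors(effects):
    """Rank factors by delta, the range of their level means."""
    delta = {name: max(m) - min(m) for name, m in effects.items()}
    return sorted(delta, key=delta.get, reverse=True), delta


def best_levels(effects):
    """Pick, for each factor, the level index with the highest mean score."""
    return {name: max(range(5), key=lambda lvl: m[lvl])
            for name, m in effects.items()}


# Synthetic, purely illustrative response: an additive score built from
# invented weights. These numbers are NOT taken from the study.
WEIGHTS = {"input": 0.4, "model": 0.1, "preprocessing": 0.2, "data_length": 0.3}

design = l25_array()
scores = [sum(WEIGHTS[name] * level / 4 for name, level in zip(FACTORS, row))
          for row in design]

effects = main_effects(design, scores)
ranking, delta = rank_factors(effects)
optimum = best_levels(effects)

if __name__ == "__main__":
    print("ranking by delta:", ranking)
    print("best level index per factor:", optimum)
```

Because the array is balanced, each level mean averages out the other factors, so for an additive response the delta of a factor recovers its own weight. With real experimental scores, interactions and noise would blur this, which is one reason the study repeated the 25 tests in three basins.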
Pages: 13