Which one is more important in daily runoff forecasting using data driven models: Input data, model type, preprocessing or data length?

Cited by: 29

Authors:

Moosavi, Vahid [1]

Fard, Zeinab Gheisoori [1]

Vafakhah, Mehdi [1]

Affiliation:

[1] Tarbiat Modares Univ, Fac Nat Resources & Marine Sci, Dept Watershed Management Engn, Tehran, Iran

Source:

JOURNAL OF HYDROLOGY | 2022 / Vol. 606

Keywords:

Artificial intelligence; Data driven model; Optimization; Signal processing; Taguchi method; SUPPORT VECTOR MACHINE; NEURAL-NETWORK MODELS; PREDICT SCOUR DEPTH; ABUTMENT SCOUR; PART; DECOMPOSITION; WATER; OPTIMIZATION; SENSITIVITY; SELECTION;

DOI:

10.1016/j.jhydrol.2022.127429

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

Rainfall-runoff modeling is of great importance in the hydrological sciences. Runoff models fall into three main categories: physically based, conceptual and empirical. Alongside process-based models, data-driven models are among the most widely used in runoff modeling. Many studies have assessed the performance of various models and the effects of input datasets, data length and different signal processing methods on modeling performance. However, each of these studies examined one of these factors separately and did not assess their joint effect on the accuracy of runoff forecasting. Therefore, assessing the importance of each factor and determining the optimum structure that produces the best accuracy remains challenging. The main aim of this study was to determine the importance and the optimal combination of these factors in daily runoff modeling. To achieve this goal, the Taguchi method was used. First, five levels were defined for each of the four factors: five input data combinations; five data-driven models, i.e. Adaptive Neuro-Fuzzy Inference System (ANFIS), Support Vector Regression (SVR), Group Method of Data Handling (GMDH), Random Forest (RF) and Partial Least Squares Regression (PLS); four signal processing methods, i.e. normalization, wavelet, ensemble empirical mode decomposition (EEMD) and singular spectrum analysis (SSA), plus a no-preprocessing condition; and five data lengths, i.e. 2, 5, 10, 15 and 20 years. The L-25 Taguchi orthogonal array was selected accordingly. The 25 required tests were run according to this array in three different basins to obtain more generalizable results.
The results were then used in a Taguchi analysis to obtain the optimal combination of factor levels and the importance of each factor for accurate runoff prediction. The results showed that the hybrid wavelet-GMDH model with a complete dataset as input and a 20-year data length provides the highest accuracy. Ranked by importance and effect on runoff prediction accuracy, the factors are ordered as follows: input dataset, data length, preprocessing and model type. GMDH and SVR had the best performance, and the wavelet and EEMD signal processing methods had the greatest effect on the performance of the data-driven models.
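
To make the experimental design in the abstract concrete, the following Python sketch builds a 25-run, 5-level orthogonal array for the four factors and ranks factors by the range of their mean signal-to-noise (S/N) ratios. It is an illustration under stated assumptions, not the authors' procedure: the array comes from a standard modular-arithmetic construction, so its row order and column assignment may differ from the tabulated L-25 array used in the paper; the input labels (`input set 1` to `input set 5`) are hypothetical placeholders because the abstract does not list the combinations; and the larger-is-better S/N formula assumes a strictly positive accuracy score per basin.

```python
import math
from itertools import combinations

LEVELS = 5

# Factor levels taken from the abstract. The input-set labels are
# hypothetical placeholders: the abstract does not name the combinations.
FACTORS = {
    "input": [f"input set {k}" for k in range(1, 6)],
    "model": ["ANFIS", "SVR", "GMDH", "RF", "PLS"],
    "preprocessing": ["none", "normalization", "wavelet", "EEMD", "SSA"],
    "data_length_years": [2, 5, 10, 15, 20],
}


def l25_array(n_factors=4):
    """Return 25 runs of level indices (0..4) forming an orthogonal array.

    Columns are i, j, (i + j) mod 5, ..., (i + 4j) mod 5, which supports
    up to six 5-level factors. Row order may differ from published tables.
    """
    if not 1 <= n_factors <= 6:
        raise ValueError("a 25-run, 5-level array holds at most 6 factors")
    rows = []
    for i in range(LEVELS):
        for j in range(LEVELS):
            full = [i, j] + [(i + k * j) % LEVELS for k in range(1, 5)]
            rows.append(tuple(full[:n_factors]))
    return rows


def is_orthogonal(rows):
    """Check that every pair of columns contains all 25 level pairs."""
    n_cols = len(rows[0])
    for a, b in combinations(range(n_cols), 2):
        if len({(r[a], r[b]) for r in rows}) != LEVELS * LEVELS:
            return False
    return True


def design_table(rows):
    """Map level indices to the named factor levels, one dict per run."""
    names = list(FACTORS)
    return [{n: FACTORS[n][lvl] for n, lvl in zip(names, r)} for r in rows]


def sn_larger_is_better(values):
    """Larger-is-better S/N ratio in dB; values must be strictly positive."""
    return -10.0 * math.log10(sum(1.0 / v ** 2 for v in values) / len(values))


def rank_factors(rows, scores_per_run):
    """Rank factors by delta = max - min of the mean S/N across levels.

    scores_per_run holds one list of positive scores per run
    (for example, one accuracy value per basin).
    """
    sn = [sn_larger_is_better(s) for s in scores_per_run]
    deltas = {}
    for col, name in enumerate(FACTORS):
        means = []
        for level in range(LEVELS):
            vals = [sn[r] for r, row in enumerate(rows) if row[col] == level]
            means.append(sum(vals) / len(vals))
        deltas[name] = max(means) - min(means)
    return sorted(deltas, key=deltas.get, reverse=True)
```

Calling `design_table(l25_array())` lists the 25 factor-level combinations to run, compared with the 5^4 = 625 runs of a full factorial design. Once each run has been scored in each basin, `rank_factors` orders the factors from most to least influential, and the level with the highest mean S/N for each factor gives the candidate optimal combination.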

Pages: 13
