Two-stage meta-ensembling machine learning model for enhanced water quality forecasting

Cited by: 3
Authors
Heydari, Sepideh [1]
Nikoo, Mohammad Reza [2]
Mohammadi, Ali [3]
Barzegar, Rahim [4]
Affiliations
[1] Univ Tehran, Fac Environm Engn, Dept Environm Engn, Tehran, Iran
[2] Sultan Qaboos Univ, Dept Civil & Architectural Engn, Muscat, Oman
[3] Sharif Univ Technol, Dept Ind Engn, Tehran, Iran
[4] Univ Quebec Abitibi Temiscamingue UQAT, Res Inst Mines & Environm RIME, Groundwater Res Grp GRES, Amos, PQ, Canada
Keywords
Water quality forecasting; Machine learning; Multi-objective optimization; Genetic algorithm; Grey Wolf Optimizer; Chlorophyll-a and Dissolved oxygen; PREDICTION; IMPLEMENTATION; DECOMPOSITION; PERFORMANCE; STREAMFLOW; RESOURCES; SYSTEM;
DOI
10.1016/j.jhydrol.2024.131767
Chinese Library Classification (CLC) number
TU [Architectural Science];
Discipline classification code
0813;
Abstract
Accurate short-term forecasting of water quality variables (WQVs) such as dissolved oxygen (DO) and chlorophyll-a (Chl-a) is crucial for the effective management of aquatic resources. This study introduces a robust two-stage optimization-ensembling framework that integrates the Grey Wolf Optimizer (GWO) and the Nondominated Sorting Genetic Algorithm II (NSGA-II) to enhance the forecasting capabilities of machine learning (ML) models. Focusing on Small Prespa Lake, Greece, we implemented an array of diverse ML techniques, including eXtreme Gradient Boosting (XGB), Gradient Boosting Regressor (GBR), Light Gradient-Boosting Machine (LightGBM), and Multilayer Perceptron (MLP). These models were fine-tuned using GWO to optimize their performance on critical WQVs predicted six hours in advance. Our methodology employed rigorous data preprocessing techniques, including lag-time feature engineering and principal component analysis (PCA), to handle the high dimensionality of the dataset. Lag times ranging from 6 to 24 hours were evaluated, with the 24-hour lag proving to be the most effective in utilizing historical data to enhance forecasting accuracy. The GWO not only facilitated hyperparameter tuning but also demonstrated a notable improvement (7.6%) in the Kling-Gupta Efficiency (KGE) over conventional randomized search methods. Subsequently, the NSGA-II was utilized for multi-objective optimization, constructing powerful model ensembles that outperformed the individual GWO-optimized models by up to 7% in KGE. In comparison to a standard genetic algorithm-based ensemble, the NSGA-II ensemble demonstrated superior effectiveness in balancing solution quality. This innovative approach not only establishes a new benchmark in water quality forecasting but also contributes substantially to proactive environmental monitoring and management strategies.
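The abstract names two building blocks that are easy to illustrate: lag-time feature engineering and the Kling-Gupta Efficiency. The following is a minimal sketch, not the authors' implementation. It assumes a regularly sampled series (the sampling interval is not stated in the abstract), and it uses the commonly cited 2009 form of KGE, which may differ from the variant used in the paper. The function names `make_lag_features` and `kge` are hypothetical.

```python
import numpy as np


def make_lag_features(series, n_lags, horizon):
    """Build a lagged design matrix X and a target vector y.

    Each row of X holds n_lags consecutive past values ending at time t;
    the matching target is the value observed `horizon` steps after t.
    (Hypothetical helper; assumes a regularly sampled 1-D series.)
    """
    series = np.asarray(series, dtype=float)
    rows = len(series) - n_lags - horizon + 1
    if rows <= 0:
        raise ValueError("series is too short for the requested lags and horizon")
    X = np.stack([series[i:i + n_lags] for i in range(rows)])
    y = series[n_lags + horizon - 1:]
    return X, y


def kge(obs, sim):
    """Kling-Gupta Efficiency (2009 form, assumed here).

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where
    r is the Pearson correlation, alpha the ratio of standard deviations
    (sim / obs), and beta the ratio of means (sim / obs).
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


if __name__ == "__main__":
    # Synthetic stand-in for a water quality series (not data from the study).
    t = np.arange(200)
    series = 8.0 + np.sin(2 * np.pi * t / 24.0)
    # Assuming hourly samples: 24 lagged values, target 6 steps ahead.
    X, y = make_lag_features(series, n_lags=24, horizon=6)
    print(X.shape, y.shape)
    # A naive persistence forecast (last lagged value) scored with KGE.
    print(kge(y, X[:, -1]))
```

A KGE of 1 indicates a perfect match between simulated and observed values; lower values indicate a loss in correlation, variability, or mean agreement. In a pipeline like the one described, X would typically be passed through PCA before model fitting, which is omitted here.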
Pages: 16
Related papers
50 in total
  • [22] Toward fully automated UED operation using two-stage machine learning model
    Zhang, Zhe
    Yang, Xi
    Huang, Xiaobiao
    Shaftan, Timur
    Smaluk, Victor
    Song, Minghao
    Wan, Weishi
    Wu, Lijun
    Zhu, Yimei
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [23] Fast automatic two-stage nonlinear model identification based on the extreme learning machine
    Deng, Jing
    Li, Kang
    Irwin, George W.
    NEUROCOMPUTING, 2011, 74 (16) : 2422 - 2429
  • [24] Efficient Two-stage Model Retraining for Machine Unlearning
    Kim, Junyaup
    Woo, Simon S.
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4360 - 4368
  • [25] Impact of a two-stage ditch on channel water quality
    Hodaj, Andi
    Bowling, Laura C.
    Frankenberger, Jane R.
    Chaubey, Indrajeet
    AGRICULTURAL WATER MANAGEMENT, 2017, 192 : 126 - 137
  • [26] Machine Learning-Based Two-Stage Data Selection Scheme for Long-Term Influenza Forecasting
    Moon, Jaeuk
    Jung, Seungwon
    Park, Sungwoo
    Hwang, Eenjun
    CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 68 (03): : 2945 - 2959
  • [27] A two-stage stochastic programming model for collaborative asset protection routing problem enhanced with machine learning: a learning-based matheuristic algorithm
    Nikzad, Erfaneh
    Bashiri, Mahdi
    INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2023, 61 (01) : 81 - 113
  • [28] An explainable two-stage machine learning approach for precipitation forecast
    Senocak, Ali Ulvi Galip
    Yilmaz, M. Tugrul
    Kalkan, Sinan
    Yucel, Ismail
    Amjad, Muhammad
    JOURNAL OF HYDROLOGY, 2023, 627
  • [29] Enhance The Performance Of Navigation: A Two-Stage Machine Learning Approach
    Fan, Yimin
    Wang, Zhiyuan
    Lin, Yuanpeng
    Tan, Haisheng
    2020 6TH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM 2020), 2020, : 212 - 219
  • [30] A machine learning approach to two-stage adaptive robust optimization
    Bertsimas, Dimitris
    Kim, Cheol Woo
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2024, 319 (01) : 16 - 30