Two-stage meta-ensembling machine learning model for enhanced water quality forecasting

Cited by: 3
Authors
Heydari, Sepideh [1]
Nikoo, Mohammad Reza [2]
Mohammadi, Ali [3]
Barzegar, Rahim [4]
Affiliations
[1] Univ Tehran, Fac Environm Engn, Dept Environm Engn, Tehran, Iran
[2] Sultan Qaboos Univ, Dept Civil & Architectural Engn, Muscat, Oman
[3] Sharif Univ Technol, Dept Ind Engn, Tehran, Iran
[4] Univ Quebec Abitibi Temiscamingue UQAT, Res Inst Mines & Environm RIME, Groundwater Res Grp GRES, Amos, PQ, Canada
Keywords
Water quality forecasting; Machine learning; Multi-objective optimization; Genetic algorithm; Grey Wolf Optimizer; Chlorophyll-a and Dissolved oxygen; PREDICTION; IMPLEMENTATION; DECOMPOSITION; PERFORMANCE; STREAMFLOW; RESOURCES; SYSTEM;
DOI
10.1016/j.jhydrol.2024.131767
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Accurate short-term forecasting of water quality variables (WQVs) such as dissolved oxygen (DO) and chlorophyll-a (Chl-a) is crucial for the effective management of aquatic resources. This study introduces a robust two-stage optimization-ensembling framework that integrates the Grey Wolf Optimizer (GWO) and the Nondominated Sorting Genetic Algorithm II (NSGA-II) to enhance the forecasting capabilities of machine learning (ML) models. Focusing on Small Prespa Lake, Greece, we implemented an array of diverse ML techniques, including eXtreme Gradient Boosting (XGB), Gradient Boosting Regressor (GBR), Light Gradient-Boosting Machine (LightGBM), and Multilayer Perceptron (MLP). These models were fine-tuned using GWO to optimize their performance over critical WQVs predicted six hours in advance. Our methodology employed rigorous data preprocessing techniques, including lag time feature engineering and principal component analysis (PCA), to handle the high dimensionality of the dataset. Optimal lag times ranging from 6 to 24 hours were evaluated, with the 24-hour lag proving to be the most effective in utilizing historical data to enhance forecasting accuracy. The GWO not only facilitated hyperparameter tuning but also demonstrated a notable improvement (7.6%) in the Kling-Gupta Efficiency (KGE) over conventional randomized search methods. Subsequently, the NSGA-II was utilized for multi-objective optimization, constructing powerful model ensembles that outperformed the individual GWO-optimized models by up to 7% in KGE. In comparison to a standard genetic algorithm-based ensemble, the NSGA-II ensemble demonstrated superior effectiveness in balancing solution quality. This innovative approach not only establishes a new benchmark in water quality forecasting but also contributes substantially to proactive environmental monitoring and management strategies.
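The lag-feature construction and the KGE score mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `make_lagged` and `kge` are hypothetical, the series is assumed to be sampled at a regular (for example hourly) step, and the KGE shown is the commonly used form, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), which may differ from the variant used in the paper.

```python
import numpy as np


def make_lagged(series, n_lags, horizon):
    """Build a lagged design matrix X and a target y `horizon` steps ahead.

    Hypothetical helper: assumes a regularly sampled 1-D series.
    """
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_lags - horizon + 1
    if n_rows <= 0:
        raise ValueError("series too short for the requested lags and horizon")
    # Row i holds series[i : i + n_lags]; its target lies `horizon` steps
    # after the last lagged value in that row.
    X = np.stack([series[i:i + n_lags] for i in range(n_rows)])
    start = n_lags + horizon - 1
    y = series[start:start + n_rows]
    return X, y


def kge(obs, sim):
    """Kling-Gupta Efficiency in its commonly used form (1 is a perfect score)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # variability ratio
    beta = sim.mean() / obs.mean()       # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Illustrative use with synthetic data: 24 lagged values, 6 steps ahead.
signal = np.sin(np.linspace(0.0, 20.0, 200)) + 5.0
X, y = make_lagged(signal, n_lags=24, horizon=6)
```

With hourly data, `n_lags=24` and `horizon=6` would correspond to the 24-hour lag and six-hour lead time described above; any regressor fitted on `X` and `y` could then be scored with `kge(y_true, y_pred)`.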
Pages: 16
Related papers
50 in total
  • [31] Equity Factor Timing: A Two-Stage Machine Learning Approach
    DiCiurcio, Kevin J.
    Wu, Boyu
    Xu, Fei
    Rodemer, Scott
    Wang, Qian
    JOURNAL OF PORTFOLIO MANAGEMENT, 2024, 50 (03): : 132 - 148
  • [32] A Novel Two-Stage Selection of Feature Subsets in Machine Learning
    Kamala, F. Rosita
    Thangaiah, P. Ranjit Jeba
    ENGINEERING TECHNOLOGY & APPLIED SCIENCE RESEARCH, 2019, 9 (03) : 4169 - 4175
  • [33] Two-stage Unsupervised Multiple Kernel Extreme Learning Machine
    Zhao, Guohan
    Xiang, Lingyun
    Zhu, Chengzhang
    Li, Feng
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018, : 800 - 805
  • [34] A novel deep learning ensemble model based on two-stage feature selection and intelligent optimization for water quality prediction
    Liu, Wenli
    Liu, Tianxiang
    Liu, Zihan
    Luo, Hanbin
    Pei, Hanmin
    ENVIRONMENTAL RESEARCH, 2023, 224
  • [35] An enhanced machine learning model for urban air quality forecasting under intense human activities
    Wang, Yelin
    Xia, Feiyang
    Yao, Linlin
    Zhao, Shunyu
    Li, Youjie
    Cai, Yanpeng
    URBAN CLIMATE, 2025, 60
  • [36] A two-stage objective model for video quality evaluation
    Cotton, B
    INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, PROCEEDINGS - VOL I, 1996, : 893 - 896
  • [37] Predicting Morphological Changes Along a Macrotidal Coastline Using a Two-Stage Machine Learning Model
    Kumar, Pavitra
    Leonardi, Nicoletta
    WATER RESOURCES RESEARCH, 2025, 61 (04)
  • [38] A hybrid WSN based two-stage model for data collection and forecasting water consumption in metropolitan areas
    Faiz, Mohammad
    Daniel, A. K.
    INTERNATIONAL JOURNAL OF NANOTECHNOLOGY, 2023, 20 (5-10) : 851 - 879
  • [39] A Novel Ensemble Machine Learning Model for Oil Production Prediction with Two-Stage Data Preprocessing
    Fan, Zhe
    Liu, Xiusen
    Wang, Zuoqian
    Liu, Pengcheng
    Wang, Yanwei
    PROCESSES, 2024, 12 (03)
  • [40] Machine Learning Enhanced NARMAX Model for Dst Index Forecasting
    Gu, Yuanlin
    Wei, Hua-Liang
    Balikhin, Michael A.
    Boynton, Richard J.
    Walker, Simon N.
    2019 25TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATION AND COMPUTING (ICAC), 2019, : 222 - 227