Performance of Machine Learning, Artificial Neural Network (ANN), and stacked ensemble models in predicting Water Quality Index (WQI) from surface water quality parameters, climatic and land use data

Cited by: 0
Authors
Satish, Nagalapalli [1]
Anmala, Jagadeesh [1]
Varma, Murari R. R. [1]
Rajitha, K. [1]
Affiliations
[1] Birla Inst Technol & Sci Pilani, Dept Civil Engn, Hyderabad Campus, Hyderabad 500078, Telangana, India
Keywords
Water Quality Index; Machine Learning; Stacked Artificial Neural Networks; Land Use and Land Cover; Climatic factors; RIVER; INDICATORS; BASIN; TOOL;
DOI
10.1016/j.psep.2024.10.054
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
Assessing water quality is essential for managing freshwater resources, safeguarding ecosystems, and protecting public health. Traditional water quality assessment relies on seasonal sampling, multi-parameter measurements, and labor-intensive sampling processes, all of which constrain frequent monitoring of vast river basins. To address this, the study combined remote sensing-based climatic and land use parameters with Principal Component Analysis (PCA), Artificial Neural Networks (ANN), and machine learning (ML) algorithms to predict the Water Quality Index (WQI). The Weighted Arithmetic Water Quality Index (WAWQI) method was used to calculate the WQI of the Godavari River Basin from the 19 available stream water quality parameters (SWQPs). PCA was then applied to reduce the dimensionality of the parameters from 19 to 6. These results led to two modeling methods for predicting the WQI. The first, a correlation-based model, predicts WQI from six SWQPs. The second, a causal-effect model, predicts WQI from land use and meteorological factors. Using AutoML techniques, an initial pool of 40 ML models was evaluated and narrowed to the top three: Extreme Gradient Boosting (XGB), Extra Trees (ET), and Random Forest (RF). In both methods, the XGB models gave the best predictions, with a coefficient of determination (R²) of 0.95 in training and 0.83 in testing for the first method, and 0.93 in training and 0.80 in testing for the second. To improve on these results, the XGB, ET, and ANN outputs were stacked with each model in both methods. Of the three stacked models, the stacked ANN_ML model outperformed stacked XGB_ML and stacked ET_ML.
In the first method, the stacked ANN_ML model achieves R² values of 0.95 and 0.91 for training and testing, respectively. In the second method, it achieves 0.95 and 0.90. These findings highlight the stacked model's ability to capture nonlinear relationships among the parameters, and the novelty of predicting WQI from land use and climate parameters, which can replace laborious, time-consuming SWQP measurements.
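As an illustration of the WAWQI step named in the abstract, the sketch below uses the commonly cited weighted arithmetic formulation: a quality rating Q_i = 100 (V_i - V_ideal,i) / (S_i - V_ideal,i), a unit weight W_i = K / S_i with K = 1 / sum(1 / S_i), and WQI = sum(W_i Q_i) / sum(W_i). Whether the paper uses exactly this variant is an assumption, and the function name, the three parameters, and their standard and ideal values are hypothetical placeholders rather than the 19 SWQPs or the standards used in the study.

```python
from typing import Sequence


def wawqi(values: Sequence[float],
          standards: Sequence[float],
          ideals: Sequence[float]) -> float:
    """Weighted Arithmetic WQI for a single water sample.

    values    -- measured concentration of each parameter (V_i)
    standards -- permissible standard value of each parameter (S_i)
    ideals    -- ideal value of each parameter in pure water (V_ideal,i)
    """
    if not (len(values) == len(standards) == len(ideals)):
        raise ValueError("all inputs must have the same length")

    # Proportionality constant K, chosen so the unit weights sum to 1.
    k = 1.0 / sum(1.0 / s for s in standards)
    weights = [k / s for s in standards]

    # Quality rating Q_i: 0 at the ideal value, 100 at the standard.
    ratings = [100.0 * (v - i) / (s - i)
               for v, s, i in zip(values, standards, ideals)]

    return sum(w * q for w, q in zip(weights, ratings)) / sum(weights)


# Hypothetical example: pH, dissolved oxygen (mg/L), BOD (mg/L).
# Standards and ideal values below are illustrative assumptions only.
standards = [8.5, 5.0, 5.0]
ideals = [7.0, 14.6, 0.0]
sample = [7.8, 6.5, 3.0]
print(f"WQI = {wawqi(sample, standards, ideals):.2f}")
```

Under this formulation a sample sitting exactly at the ideal values scores 0 and one sitting exactly at the standards scores 100, so lower values indicate better water quality.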
Pages: 177 - 195
Page count: 19