Stream water quality prediction using boosted regression tree and random forest models

Cited by: 60
Authors
Alnahit, Ali O. [2 ]
Mishra, Ashok K. [1 ]
Khan, Abdul A. [1 ]
Affiliations
[1] Clemson Univ, Glenn Dept Civil Engn, Clemson, SC 29634 USA
[2] King Saud Univ, Dept Civil Engn, Riyadh, Saudi Arabia
Keywords
Water quality; Machine learning algorithms; Random forests; Boosted regression trees; SOIL ORGANIC-CARBON; LAND-USE; MACROINVERTEBRATE ASSEMBLAGES; LINEAR-REGRESSION; RIVER-BASIN; FRESH-WATER; MULTIPLE; COVER; CLASSIFICATION; PHOSPHORUS
DOI
10.1007/s00477-021-02152-4
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Reliable water quality prediction can improve environmental flow monitoring and the sustainability of stream ecosystems. In this study, we compared two machine learning methods for predicting water quality parameters, including total nitrogen (TN), total phosphorus (TP), and turbidity (TUR), for 97 watersheds located in the Southeast Atlantic region of the USA. The modeling framework incorporates multiple climate and watershed variables (characteristics) that often control the water quality indicators in different landscapes. Three techniques, namely stepwise regression (SR), Least Absolute Shrinkage and Selection Operator (LASSO), and genetic algorithm (GA), were implemented to identify appropriate predictors out of 28 climate and catchment-related variables. The selected predictors were then used to develop random forest (RF) and boosted regression tree (BRT) models for water quality prediction in the selected watersheds. The results showed that while both algorithms provided reasonable results (based on statistical metrics), the RF algorithm was easier to train and more robust to overfitting. Partial dependence plots highlighted the complex and nonlinear relationships between the individual predictors and the water quality indicators. The thresholds obtained from the partial dependence plots showed that the median values of total nitrogen (TN) and total phosphorus (TP) in streams increase significantly when the percentage of urban and agricultural lands is above 40% and 43% of the watershed area, respectively. Furthermore, when soil hydraulic conductivity increases, the reduction in runoff results in decreased turbidity levels in streams. Identifying the key watershed characteristics and their critical thresholds can therefore help watershed managers create appropriate regulations for managing and sustaining healthy stream ecosystems. In addition, the forecasting models can improve water quality predictions in ungauged watersheds.
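The workflow described in the abstract (predictor selection, then RF and BRT fitting, then partial dependence) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic, the column meanings (`urban_pct`), the step response at 40%, and all hyperparameters are hypothetical, scikit-learn's `GradientBoostingRegressor` stands in for a boosted regression tree, and only the LASSO selection step is shown (stepwise regression and the genetic algorithm are omitted).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: 97 watersheds x 28 predictors (NOT the study's data).
n_watersheds, n_predictors = 97, 28
X = rng.uniform(0.0, 100.0, size=(n_watersheds, n_predictors))
urban_pct = X[:, 0]  # hypothetical "percent urban land" column
# Hypothetical response that steps up once urban land exceeds 40%.
y = (1.0 + 2.0 * (urban_pct > 40.0) + 0.01 * X[:, 1]
     + rng.normal(0.0, 0.2, n_watersheds))

# Step 1: LASSO-based selection; keep predictors with non-zero coefficients.
X_std = StandardScaler().fit_transform(X)
lasso = LassoCV(cv=5, random_state=0).fit(X_std, y)
selected = np.flatnonzero(lasso.coef_)

# Step 2: fit both tree ensembles on the selected predictors.
X_sel = X[:, selected]
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_sel, y)
brt = GradientBoostingRegressor(
    n_estimators=300, learning_rate=0.05, max_depth=3, random_state=0
).fit(X_sel, y)


def partial_dependence_curve(model, X, column, grid):
    """Average prediction when `column` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, column] = value
        curve.append(model.predict(X_mod).mean())
    return np.array(curve)


# Step 3: partial dependence of the response on the urban-land predictor.
grid = np.linspace(0.0, 100.0, 21)
urban_col = int(np.flatnonzero(selected == 0)[0])  # position after selection
pd_rf = partial_dependence_curve(rf, X_sel, urban_col, grid)
pd_brt = partial_dependence_curve(brt, X_sel, urban_col, grid)
```

Plotting `pd_rf` and `pd_brt` against `grid` would show where the averaged prediction changes sharply, which is the kind of threshold the abstract reads off its partial dependence plots. The manual averaging loop is used here instead of a library helper so that the computation itself is visible.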
Pages: 2661 - 2680
Page count: 20
Related papers
50 in total
  • [21] Prediction of Quality of Water According to a Random Forest Classifier
    Alomani, Shahd Maadi
    Alhawiti, Najd Ibrahim
    Alhakamy, A'aeshah
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (06) : 892 - 899
  • [22] Classification tree and random forest based prediction models on molecular autofluorescence
    Tu, Yi-Shu
    Lin, Tze-Hao
    Tseng, Yufeng J.
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2013, 245
  • [23] Using random forest and decision tree models for a new vehicle prediction approach in computational toxicology
    Mistry, Pritesh
    Neagu, Daniel
    Trundle, Paul R.
    Vessey, Jonathan D.
    SOFT COMPUTING, 2016, 20 (08) : 2967 - 2979
  • [26] Road Crashes Analysis and Prediction using Gradient Boosted and Random Forest Trees
    Elyassami, Sanaa
    Hamid, Yasir
    Habuza, Tetiana
    2020 6TH IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY (IEEE CIST'20), 2020, : 520 - 525
  • [27] Prediction of Irrigation Water Quality Indices Using Random Committee, Discretization Regression, REPTree, and Additive Regression
    Al-Mukhtar, Mustafa
    Srivastava, Aman
    Khadke, Leena
    Al-Musawi, Tariq
    Elbeltagi, Ahmed
    WATER RESOURCES MANAGEMENT, 2024, 38 (01) : 343 - 368
  • [28] Wind Turbine Noise Prediction Using Random Forest Regression
    Iannace, Gino
    Ciaburro, Giuseppe
    Trematerra, Amelia
    MACHINES, 2019, 7 (04)
  • [29] Restaurant Queuing Time Prediction Using Random Forest Regression
    Xue, Yijia
    Zhang, Xiang
    2022 12TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION SYSTEMS (ICPRS), 2022,
  • [30] Novel ensembles of COPRAS multi-criteria decision-making with logistic regression, boosted regression tree, and random forest for spatial prediction of gully erosion susceptibility
    Arabameri, Alireza
    Yamani, Mojtaba
    Pradhan, Biswajeet
    Melesse, Assefa
    Shirani, Kourosh
    Dieu Tien Bui
    SCIENCE OF THE TOTAL ENVIRONMENT, 2019, 688 : 903 - 916