Stream water quality prediction using boosted regression tree and random forest models

Cited by: 0
Authors
Ali O. Alnahit
Ashok K. Mishra
Abdul A. Khan
Affiliations
[1] Clemson University,Glenn Department of Civil Engineering
[2] King Saud University,Department of Civil Engineering
Keywords
Water quality; Machine learning algorithms; Random forests; Boosted regression trees
DOI
Not available
Abstract
Reliable water quality prediction can improve environmental flow monitoring and the sustainability of stream ecosystems. In this study, we compared two machine learning methods for predicting water quality parameters, including total nitrogen (TN), total phosphorus (TP), and turbidity (TUR), for 97 watersheds located in the Southeast Atlantic region of the USA. The modeling framework incorporates multiple climate and watershed characteristics that often control water quality indicators in different landscapes. Three techniques, namely stepwise regression (SR), the Least Absolute Shrinkage and Selection Operator (LASSO), and a genetic algorithm (GA), were implemented to identify appropriate predictors from 28 climate and catchment-related variables. The selected predictors were then used to develop Random Forest (RF) and Boosted Regression Tree (BRT) models for water quality prediction in the selected watersheds. The results showed that, while both algorithms performed reasonably well on the statistical metrics, the RF algorithm was easier to train and robust to overfitting. Partial dependence plots highlighted the complex and nonlinear relationships between the individual predictors and the water quality indicators. The thresholds obtained from the partial dependence plots showed that the median values of TN and TP in streams increase significantly when the percentage of urban and agricultural lands is above 40% and 43% of the watershed area, respectively. Furthermore, when soil hydraulic conductivity increases, the resulting reduction in runoff leads to lower turbidity levels in streams. Identifying the key watershed characteristics and their critical thresholds can therefore help watershed managers create appropriate regulations for managing and sustaining healthy stream ecosystems. In addition, the forecasting models can improve water quality predictions in ungauged watersheds.
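The partial dependence idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code or fitted model: `toy_model`, the four made-up watershed rows, and the grid values are all hypothetical, chosen only so that the response steps up once urban land cover exceeds 40%. For each grid value, the feature of interest is overwritten in every row and the predictions are averaged; the largest jump in the resulting curve suggests where a threshold may lie.

```python
from statistics import mean
from typing import Callable, Sequence

Row = Sequence[float]


def partial_dependence(predict: Callable[[Row], float],
                       rows: Sequence[Row],
                       feature: int,
                       grid: Sequence[float]) -> list[float]:
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)       # copy so the input rows stay untouched
            modified[feature] = value  # overwrite the feature of interest
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


def toy_model(row: Row) -> float:
    """Hypothetical stand-in for a fitted RF or BRT model (an assumption).

    Columns: [urban land cover in %, soil hydraulic conductivity].
    """
    urban, conductivity = row
    base = 1.0 + 0.01 * urban
    jump = 2.0 if urban > 40.0 else 0.0  # artificial step above 40% urban land
    return base + jump - 0.1 * conductivity


# Made-up watersheds, for illustration only.
rows = [(10.0, 1.0), (35.0, 2.0), (55.0, 3.0), (80.0, 4.0)]
grid = [0.0, 20.0, 40.0, 60.0, 80.0]

curve = partial_dependence(toy_model, rows, feature=0, grid=grid)

# Find the pair of adjacent grid points with the largest increase in the curve.
steps = [b - a for a, b in zip(curve, curve[1:])]
largest = max(range(len(steps)), key=steps.__getitem__)
threshold_interval = (grid[largest], grid[largest + 1])

print(curve)
print(threshold_interval)
```

With a real fitted model, `predict` would wrap that model's prediction call, and a finer grid would narrow the interval in which the threshold falls.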
Pages: 2661 - 2680
Number of pages: 19
Related papers (10 of 50 shown)
  • [1] Stream water quality prediction using boosted regression tree and random forest models
    Alnahit, Ali O.
    Mishra, Ashok K.
    Khan, Abdul A.
    [J]. STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT, 2022, 36 (09) : 2661 - 2680
  • [2] Landslide Susceptibility Mapping Based on Random Forest and Boosted Regression Tree Models, and a Comparison of Their Performance
    Park, Soyoung
    Kim, Jinsoo
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (05):
  • [3] GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran
    Naghibi, Seyed Amir
    Pourghasemi, Hamid Reza
    Dixon, Barnali
    [J]. ENVIRONMENTAL MONITORING AND ASSESSMENT, 2016, 188 (01) : 1 - 27
  • [4] GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran
    Seyed Amir Naghibi
    Hamid Reza Pourghasemi
    Barnali Dixon
    [J]. Environmental Monitoring and Assessment, 2016, 188
  • [5] Uncertainty in the spatial prediction of soil texture Comparison of regression tree and Random Forest models
    Liess, Mareike
    Glaser, Bruno
    Huwe, Bernd
    [J]. GEODERMA, 2012, 170 : 70 - 79
  • [6] Comparison of boosted regression tree and random forest models for mapping topsoil organic carbon concentration in an alpine ecosystem
    Yang, Ren-Min
    Zhang, Gan-Lin
    Liu, Feng
    Lu, Yuan-Yuan
    Yang, Fan
    Yang, Fei
    Yang, Min
    Zhao, Yu-Guo
    Li, De-Cheng
    [J]. ECOLOGICAL INDICATORS, 2016, 60 : 870 - 878
  • [7] Spatial prediction of flood susceptibility using random-forest and boosted-tree models in Seoul metropolitan city, Korea
    Lee, Sunmin
    Kim, Jeong-Cheol
    Jung, Hyung-Sup
    Lee, Moung Jin
    Lee, Saro
    [J]. GEOMATICS NATURAL HAZARDS & RISK, 2017, 8 (02) : 1185 - 1203
  • [8] Predicting drivers of nuisance macrophyte cover in a regulated California stream using boosted regression tree models
    Zefferman, Emily P.
    Harris, David J.
    [J]. JOURNAL OF AQUATIC PLANT MANAGEMENT, 2016, 54 : 78 - 86
  • [9] Approximating Prediction Uncertainty for Random Forest Regression Models
    Coulston, John W.
    Blinn, Christine E.
    Thomas, Valerie A.
    Wynne, Randolph H.
    [J]. PHOTOGRAMMETRIC ENGINEERING AND REMOTE SENSING, 2016, 82 (03) : 189 - 197
  • [10] Forest stand susceptibility mapping during harvesting using logistic regression and boosted regression tree machine learning models
    Shabani, Saeid
    Pourghasemi, Hamid Reza
    Blaschke, Thomas
    [J]. GLOBAL ECOLOGY AND CONSERVATION, 2020, 22