River Water Salinity Prediction Using Hybrid Machine Learning Models

Cited by: 63
Authors
Melesse, Assefa M. [1]
Khosravi, Khabat [2]
Tiefenbacher, John P. [3]
Heddam, Salim [4]
Kim, Sungwon [5]
Mosavi, Amir [6, 7, 8, 9]
Pham, Binh Thai [10]
Affiliations
[1] Florida Int Univ, Dept Earth & Environm, Miami, FL 33199 USA
[2] Sari Agr & Nat Resources Univ, Dept Watershed Management, Sari 4818168984, Iran
[3] Texas State Univ, Dept Geog, San Marcos, TX 78666 USA
[4] Univ 20 Aout 1955, Lab Res Biodivers Interact Ecosyst & Biotechnol, Route El Hadaik,BP 26, Skikda 21000, Algeria
[5] Dongyang Univ, Dept Railroad Construct & Safety Engn, Yeongju 36040, South Korea
[6] Tech Univ Dresden, Fac Civil Engn, D-01069 Dresden, Germany
[7] Norwegian Univ Life Sci, Sch Business & Econ, N-1430 As, Norway
[8] Thuringian Inst Sustainabil & Climate Protect, D-07743 Jena, Germany
[9] Obuda Univ, Inst Automat, H-1034 Budapest, Hungary
[10] Duy Tan Univ, Inst Res & Dev, Da Nang 550000, Vietnam
Keywords
water salinity; machine learning; bagging; random forest; random subspace; data science; hydrological model; big data; hydroinformatics; electrical conductivity; ARTIFICIAL NEURAL-NETWORKS; SUPPORT VECTOR MACHINES; RANDOM SUBSPACE ENSEMBLES; FUZZY INFERENCE SYSTEM; DATA MINING MODELS; DISSOLVED-OXYGEN; ELECTRICAL-CONDUCTIVITY; REGRESSION; GROUNDWATER; PERFORMANCE;
DOI
10.3390/w12102951
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Electrical conductivity (EC), one of the most widely used indices for water quality assessment, has been applied to predict the salinity of the Babol-Rood River, the greatest source of irrigation water in northern Iran. This study uses two individual algorithms, M5 Prime (M5P) and random forest (RF), and eight novel hybrid algorithms (bagging-M5P, bagging-RF, random subspace (RS)-M5P, RS-RF, random committee (RC)-M5P, RC-RF, additive regression (AR)-M5P, and AR-RF) to predict EC. Thirty-six years of observations collected by the Mazandaran Regional Water Authority were randomly divided into two sets: 70% from the period 1980 to 2008 was used as model-training data and 30% from 2009 to 2016 was used as testing data to validate the models. Several water quality variables (pH, HCO₃⁻, Cl⁻, SO₄²⁻, Na⁺, Mg²⁺, Ca²⁺, river discharge (Q), and total dissolved solids (TDS)) were modeling inputs. Using EC and the correlation coefficients (CC) of the water quality variables, a set of nine input combinations was established. TDS, the most effective input variable, had the highest EC-CC (r = 0.91), and it was also determined to be the most important input variable among the input combinations. All models were trained and each model's prediction power was evaluated with the testing data. Several quantitative criteria and visual comparisons were used to evaluate modeling capabilities. Results indicate that, in most cases, hybrid algorithms enhance individual algorithms' predictive powers. The AR algorithm enhanced both M5P and RF predictions better than bagging, RS, and RC. M5P performed better than RF. Further, AR-M5P outperformed all other algorithms (R² = 0.995, RMSE = 8.90 µS/cm, MAE = 6.20 µS/cm, NSE = 0.994, and PBIAS = -0.042). The hybridization of machine learning methods has significantly improved model performance to capture maximum salinity values, which is essential in water resource management.
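As a rough illustration of the quantitative criteria named in the abstract, the sketch below computes RMSE, MAE, the Nash-Sutcliffe efficiency (NSE), and percent bias (PBIAS) for paired observed and predicted EC values. It is not the authors' code. The function name `evaluate` and the sample values are hypothetical, and the PBIAS sign convention (observed minus predicted, as a percentage of the observed total) is an assumption, since the abstract does not state which convention was used.

```python
import math


def evaluate(observed, predicted):
    """Return RMSE, MAE, NSE and PBIAS for paired observed/predicted values.

    Assumes the observed values are not all identical and do not sum to
    zero; otherwise NSE or PBIAS would divide by zero.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    sse = sum(e * e for e in errors)                  # sum of squared errors
    sst = sum((o - mean_obs) ** 2 for o in observed)  # spread of observations
    return {
        "rmse": math.sqrt(sse / n),
        "mae": sum(abs(e) for e in errors) / n,
        "nse": 1.0 - sse / sst,
        # Assumed convention: negative means predictions are too high overall.
        "pbias": 100.0 * sum(errors) / sum(observed),
    }


# Made-up EC values in µS/cm, for illustration only (not data from the study).
observed = [400.0, 500.0, 600.0, 700.0]
predicted = [410.0, 490.0, 600.0, 720.0]
metrics = evaluate(observed, predicted)
print(metrics)
```

An NSE of 1 indicates a perfect match, while values at or below 0 mean the model does no better than predicting the observed mean. RMSE and MAE carry the units of EC.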
Pages: 1-21
Page count: 21
Related papers
50 in total
  • [42] Flash flood susceptibility prediction mapping for a road network using hybrid machine learning models
    Ha, Hang
    Luu, Chinh
    Bui, Quynh Duy
    Pham, Duy-Hoa
    Hoang, Tung
    Nguyen, Viet-Phuong
    Vu, Minh Tuan
    Pham, Binh Thai
    [J]. NATURAL HAZARDS, 2021, 109 (01) : 1247 - 1270
  • [43] Cardiovascular Disease Prediction Using Machine Learning Models
    Nikam, Atharv
    Bhandari, Sanket
    Mhaske, Aditya
    Mantri, Shamla
    [J]. 2020 IEEE PUNE SECTION INTERNATIONAL CONFERENCE (PUNECON), 2020 : 22 - 27
  • [44] Genetic Programming and Gaussian Process Regression Models for Groundwater Salinity Prediction: Machine Learning for Sustainable Water Resources Management
    Lal, Alvin
    Datta, Bithin
    [J]. 2018 IEEE CONFERENCE ON TECHNOLOGIES FOR SUSTAINABILITY (SUSTECH), 2018 : 225 - 231
  • [45] Bug Prediction of SystemC Models Using Machine Learning
    Efendioglu, Mustafa
    Sen, Alper
    Koroglu, Yavuz
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2019, 38 (03) : 419 - 429
  • [46] Spatial prediction of groundwater salinity in multiple aquifers of the Mekong Delta region using explainable machine learning models
    Jeong, Heewon
    Abbas, Ather
    Kim, Hyo Gyeom
    Hoan, Hoang Van
    Tuan, Pham Van
    Long, Phan Thang
    Lee, Eunhee
    Cho, Kyung Hwa
    [J]. WATER RESEARCH, 2024, 266
  • [47] Breast Cancer Prediction using Machine Learning Models
    Iparraguirre-Villanueva, Orlando
    Epifania-Huerta, Andres
    Torres-Ceclen, Carmen
    Ruiz-Alvarado, John
    Cabanillas-Carbonell, Michael
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (02) : 610 - 620
  • [48] Prediction of hepatitis E using machine learning models
    Guo, Yanhui
    Feng, Yi
    Qu, Fuli
    Zhang, Li
    Yan, Bingyu
    Lv, Jingjing
    [J]. PLOS ONE, 2020, 15 (09)
  • [49] Prediction of Frailty Grade Using Machine Learning Models
    Erdas, Cagatay Berke
    Olcer, Didem
    [J]. 2022 MEDICAL TECHNOLOGIES CONGRESS (TIPTEKNO'22), 2022
  • [50] Cocrystal Prediction Using Machine Learning Models and Descriptors
    Mswahili, Medard Edmund
    Lee, Min-Jeong
    Martin, Gati Lother
    Kim, Junghyun
    Kim, Paul
    Choi, Guang J.
    Jeong, Young-Seob
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (03) : 1 - 12