Prediction of Pseudomonas aeruginosa abundance in drinking water distribution systems using machine learning

Authors
Zhou, Qiaomei [1]
Li, Yukang [2]
Wang, Min [2]
Huang, Jingang [1, 3]
Li, Weishuai [1]
Qiu, Shanshan [1]
Wang, Haibo [2]
Affiliations
[1] Hangzhou Dianzi Univ, Coll Mat & Environm Engn, Hangzhou 310018, Peoples R China
[2] Chinese Acad Sci, Res Ctr Ecoenvironm Sci, Key Lab Drinking Water Sci & Technol, Beijing 100085, Peoples R China
[3] Hangzhou Dianzi Univ, China Austria Belt & Rd Joint Lab Artificial Intel, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine learning; Pseudomonas aeruginosa; Drinking water; Feature selection; Model validation; OPTIMIZATION; SELECTION;
DOI
10.1016/j.psep.2024.11.099
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Detecting Pseudomonas aeruginosa is a challenging but crucial task for ensuring the biosafety of drinking water. Current cultivation and molecular qPCR methods are costly, laborious, and time-consuming, leading to inaccuracies and delayed monitoring. In this study, three machine learning (ML) models, eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Support Vector Regression (SVR), were developed, interpreted, and validated for their ability to predict P. aeruginosa abundance in both urban and rural drinking water distribution systems (DWDS). To ensure the reliability and robustness of the ML models, data leakage management during data pre-processing, 5-fold cross-validation, and grid search for hyperparameter tuning were used during the training phase. To control overfitting, feature selection using an embedded method was implemented to exclude three low-contributing input variables: oxidation-reduction potential (ORP), total chlorine, and heterotrophic plate counts (HPC). The XGBoost model outperformed the RF and SVR models in accuracy and generalizability in predicting P. aeruginosa abundance, achieving training/testing R² of 0.92/0.85 in the urban system and 0.94/0.87 in the rural system. Feature importance analysis revealed that water temperature, dissolved oxygen (DO), residual chlorine, and NO3⁻-N were key variables for the prediction. Validation experiments, based on random sampling from both urban and rural DWDS, demonstrated acceptable relative errors of 10.77% and 8.86%, respectively. Overall, this study provides an applicable ML modeling framework for the accurate and fast prediction of P. aeruginosa abundance in DWDS, potentially reducing laborious experiments in the future.
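The training procedure named in the abstract (pre-processing kept free of data leakage, 5-fold cross-validation, grid search over hyperparameters, scoring by R²) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code. Everything here is an assumption made for illustration: the data are synthetic, a k-nearest-neighbours regressor stands in for XGBoost, RF, and SVR, and all function names are hypothetical. In practice the same pattern is usually built from scikit-learn's `Pipeline` and `GridSearchCV`.

```python
import random
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_scaler(rows):
    """Column means and standard deviations of the given rows only."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return means, stds


def transform(rows, scaler):
    """Standardise rows with previously fitted means and deviations."""
    means, stds = scaler
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k nearest training rows (stand-in model)."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(train_x[i], query))
    nearest = sorted(range(len(train_x)), key=sq_dist)[:k]
    return statistics.fmean(train_y[i] for i in nearest)


def kfold_indices(n, n_splits, seed):
    """Shuffle 0..n-1 and deal the indices into n_splits disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_splits] for i in range(n_splits)]


def cv_score(x, y, k, n_splits=5, seed=0):
    """Mean held-out R² over the folds for one hyperparameter value."""
    scores = []
    for held_out in kfold_indices(len(x), n_splits, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(x)) if i not in held]
        # Leakage control: the scaler sees the training fold only.
        scaler = fit_scaler([x[i] for i in train_idx])
        train_x = transform([x[i] for i in train_idx], scaler)
        test_x = transform([x[i] for i in held_out], scaler)
        train_y = [y[i] for i in train_idx]
        preds = [knn_predict(train_x, train_y, q, k) for q in test_x]
        scores.append(r2_score([y[i] for i in held_out], preds))
    return statistics.fmean(scores)


def grid_search(x, y, k_grid):
    """Score every candidate k by 5-fold CV and return the best one."""
    results = {k: cv_score(x, y, k) for k in k_grid}
    return max(results, key=results.get), results


def make_synthetic_data(n=60, seed=42):
    """Synthetic two-feature regression data; not measurements from the study."""
    rng = random.Random(seed)
    x = [[rng.random(), rng.random()] for _ in range(n)]
    y = [2.0 * a - b + rng.gauss(0.0, 0.05) for a, b in x]
    return x, y


if __name__ == "__main__":
    features, target = make_synthetic_data()
    best_k, scores = grid_search(features, target, [1, 3, 5, 7])
    print("best k:", best_k, "mean CV R2:", round(scores[best_k], 3))
```

The point of fitting the scaler inside each fold is that the held-out rows never influence the pre-processing statistics, which is one common reading of "data leakage management"; the abstract does not specify which steps the authors applied.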
Pages: 1050-1060
Number of pages: 11