Addressing gaps in data on drinking water quality through data integration and machine learning: evidence from Ethiopia

Authors
Alemayehu A. Ambel
Robert Bain
Tefera Bekele Degefu
Ayca Donmez
Richard Johnston
Tom Slaymaker
Affiliations
[1] Development Data Group, World Bank
[2] Division of Data, Analysis, Planning and Monitoring, UNICEF
[3] Department of Environment, Climate Change and Health, WHO
Abstract
Monitoring access to safely managed drinking water services requires information on water quality. An increasing number of countries have integrated water quality testing into household surveys; however, such tests are not expected to be included in all future surveys. Using water testing data from the 2016 Ethiopia Socio-Economic Survey (ESS), we developed predictive models, based on common machine learning classification algorithms, to identify households using contaminated (≥1 E. coli per 100 mL) drinking water sources. These models were then applied to the 2013–2014 and 2018–2019 waves of the ESS, which did not include water testing. The highest-performing model achieved good accuracy (88.5%; 95% CI 86.3%, 90.6%) and discrimination (AUC 0.91; 95% CI 0.89, 0.94). A model using demographic, socioeconomic, and geospatial variables gave results comparable to those of the full-features model, whereas a model based exclusively on water source type performed poorly. Drinking water quality at the point of collection can be predicted from demographic, socioeconomic, and geospatial variables that are often available in household surveys.
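The abstract reports accuracy and AUC, each with a 95% confidence interval. As a rough illustration of how such figures can be computed, the sketch below evaluates a classifier's predicted probabilities on synthetic data. Everything in it is an assumption made for illustration: the abstract does not say how the intervals were obtained (a percentile bootstrap is used here), the data are randomly generated rather than taken from the ESS, and the function names are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, values, metric, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not from the source):
    resample households with replacement and recompute the metric."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(metric(yt, [values[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in for held-out test data: 1 = contaminated source.
rng = random.Random(42)
y_true = [1 if rng.random() < 0.6 else 0 for _ in range(300)]
# Hypothetical predicted probabilities, loosely correlated with the label.
scores = [min(1.0, max(0.0, 0.35 + 0.3 * y + rng.gauss(0, 0.2))) for y in y_true]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc_ci = bootstrap_ci(y_true, y_pred, accuracy)
auc_ci = bootstrap_ci(y_true, scores, auc)
print(f"accuracy {accuracy(y_true, y_pred):.3f}, 95% CI {acc_ci[0]:.3f}, {acc_ci[1]:.3f}")
print(f"AUC {auc(y_true, scores):.3f}, 95% CI {auc_ci[0]:.3f}, {auc_ci[1]:.3f}")
```

Accuracy depends on the chosen probability threshold (0.5 here), while AUC summarises ranking quality across all thresholds, which is why the two are commonly reported side by side.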