Addressing gaps in data on drinking water quality through data integration and machine learning: evidence from Ethiopia

Authors
Alemayehu A. Ambel
Robert Bain
Tefera Bekele Degefu
Ayca Donmez
Richard Johnston
Tom Slaymaker
Affiliations
[1] Development Data Group, World Bank
[2] Division of Data, Analysis, Planning and Monitoring, UNICEF
[3] Department of Environment, Climate Change and Health, WHO
Abstract
Monitoring access to safely managed drinking water services requires information on water quality. An increasing number of countries have integrated water quality testing into household surveys; however, such tests are not expected to be included in all future surveys. Using water testing data from the 2016 Ethiopia Socio-Economic Survey (ESS), we developed predictive models, based on common machine learning classification algorithms, to identify households using contaminated (≥1 E. coli per 100 mL) drinking water sources. These models were then applied to the 2013–2014 and 2018–2019 waves of the ESS, which did not include water testing. The highest-performing model achieved good accuracy (88.5%; 95% CI 86.3%, 90.6%) and discrimination (AUC 0.91; 95% CI 0.89, 0.94). A model using only demographic, socioeconomic, and geospatial variables gave results comparable to those of the full-feature model, whereas a model based exclusively on water source type performed poorly. Drinking water quality at the point of collection can therefore be predicted from demographic, socioeconomic, and geospatial variables that are often available in household surveys.
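The abstract reports accuracy and AUC with 95% confidence intervals but does not say how the intervals were obtained. As a minimal sketch only, the block below shows one common way to get such intervals: a percentile bootstrap over held-out predictions. The synthetic labels and scores, the 0.5 classification threshold, and the function names (`accuracy`, `auc`, `bootstrap_ci`) are all assumptions made for illustration; none of them come from the paper or from the ESS data.

```python
import random


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(metric, y_true, values, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, values)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:  # skip resamples that lost one class
            continue
        stats.append(metric(yt, [values[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in for a held-out test set (NOT survey data):
# label 1 = contaminated source, score = hypothetical model probability.
rng = random.Random(42)
y_true = [rng.randint(0, 1) for _ in range(200)]
scores = [min(1.0, max(0.0, 0.3 + 0.4 * t + rng.gauss(0, 0.2))) for t in y_true]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc_ci = bootstrap_ci(accuracy, y_true, y_pred)
auc_ci = bootstrap_ci(auc, y_true, scores)
print("accuracy", round(accuracy(y_true, y_pred), 3), acc_ci)
print("AUC", round(auc(y_true, scores), 3), auc_ci)
```

The pairwise AUC is quadratic in the sample size, which is fine for a sketch of this size; a rank-based formula would be preferable for a full survey sample.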
Related papers
  • [1] Addressing gaps in data on drinking water quality through data integration and machine learning: evidence from Ethiopia
    Ambel, Alemayehu A.
    Bain, Robert
    Degefu, Tefera Bekele
    Donmez, Ayca
    Johnston, Richard
    Slaymaker, Tom
    NPJ CLEAN WATER, 2023, 6 (01)
  • [2] Addressing Data Gaps To Improve Evidence On AT Outcomes - An Update From Australia
    Steel, Emily J.
    ASSISTIVE TECHNOLOGY, 2021, 33 (03) : 151 - 151
  • [3] A review of machine learning and big data applications in addressing ecosystem service research gaps
    Manley, Kyle
    Nyelele, Charity
    Egoh, Benis N.
    ECOSYSTEM SERVICES, 2022, 57
  • [4] A survey of machine learning methods applied to anomaly detection on drinking-water quality data
    Dogo, Eustace M.
    Nwulu, Nnamdi I.
    Twala, Bhekisipho
    Aigbavboa, Clinton
    URBAN WATER JOURNAL, 2019, 16 (03) : 235 - 248
  • [5] Data Integration in Machine Learning
    Li, Yifeng
    Ngom, Alioune
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 1665 - 1671
  • [6] The effect of forest land use on the cost of drinking water supply: machine learning evidence from South African data
    Gelo, Dambala
    Turpie, Jane
    JOURNAL OF ENVIRONMENTAL ECONOMICS AND POLICY, 2022, 11 (04) : 361 - 374
  • [7] Data Integration using Machine Learning
    Birgersson, Marcus
    Hansson, Gustav
    Franke, Ulrik
    2016 IEEE 20TH INTERNATIONAL ENTERPRISE DISTRIBUTED OBJECT COMPUTING WORKSHOP (EDOCW), 2016, : 313 - 322
  • [8] Machine Learning for Medical Data Integration
    Mueller, Armin
    Christmann, Lara-Sophie
    Kohler, Severin
    Eils, Roland
    Prasser, Fabian
    CARING IS SHARING-EXPLOITING THE VALUE IN DATA FOR HEALTH AND INNOVATION-PROCEEDINGS OF MIE 2023, 2023, 302 : 691 - 695
  • [9] Quality of Data in Machine Learning
    Kariluoto, Antti
    Kultanen, Joni
    Soininen, Jukka
    Parnanen, Arto
    Abrahamsson, Pekka
    2021 21ST INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION (QRS-C 2021), 2021, : 216 - 221
  • [10] Addressing Microplastic Environmental Data Gaps Through Undergraduate Research
    Kryl, Michelle
    Lewandoski, Ashlee
    DiBlasio, Grace
    Howard, Ethan
    Jeznach, Lillian
    ENVIRONMENTAL ENGINEERING SCIENCE, 2024, 41 (11) : 499 - 507