Comparison of machine learning methods for estimating case fatality ratios: An Ebola outbreak simulation study

Cited by: 1
Authors
Forna, Alpha [1]
Dorigatti, Ilaria [2]
Nouvellet, Pierre [2,3]
Donnelly, Christl A. [2,4]
Affiliations
[1] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC, Canada
[2] Imperial Coll London, MRC, Dept Infect Dis Epidemiol, Ctr Global Infect Dis Anal, London, England
[3] Univ Sussex, Sch Life Sci, Brighton, E Sussex, England
[4] Univ Oxford, Dept Stat, Oxford, England
Source
PLOS ONE | 2021 / Volume 16 / Issue 09
Funding
Wellcome Trust (UK); Medical Research Council (UK)
Keywords
MISSING DATA; DISEASE
DOI
10.1371/journal.pone.0257005
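Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract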
Background: Machine learning (ML) algorithms are increasingly used in infectious disease epidemiology. Epidemiologists should understand how ML algorithms behave in the context of outbreak data, where missing data are almost ubiquitous.
Methods: Using simulated data, we use an ML algorithmic framework to evaluate data imputation performance and the resulting case fatality ratio (CFR) estimates, focusing on the scale and type of data missingness (i.e., missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR)).
Results: Across ML methods, dataset sizes and proportions of training data used, the area under the receiver operating characteristic curve decreased by 7% (median, range: 1%-16%) when missingness was increased from 10% to 40%. Overall reduction in CFR bias for MAR across methods, proportion of missingness, outbreak size and proportion of training data was 0.5% (median, range: 0%-11%).
Conclusion: ML methods could reduce bias and increase the precision in CFR estimates at low levels of missingness. However, no method is robust to high percentages of missingness. Thus, a data-centric approach is recommended in outbreak settings: patient survival outcome data should be prioritised for collection, and random-sample follow-ups should be implemented to ascertain missing outcomes.
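The three missingness mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' simulation or ML imputation framework: the age covariate, the death probabilities, the missingness multipliers and all function names below are hypothetical choices made only for illustration. The sketch hides outcomes under each mechanism and computes a naive complete-case CFR (deaths divided by cases with a known outcome), which is the quantity an imputation method would aim to improve on.

```python
import random


def simulate_cases(n, seed=0):
    """Simulate n hypothetical cases as (is_old, died) pairs.

    Assumed for illustration only: 30% of cases are 'old', with death
    probability 0.8 for old cases and 0.5 for the rest.
    """
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        is_old = rng.random() < 0.3
        died = rng.random() < (0.8 if is_old else 0.5)
        cases.append((is_old, died))
    return cases


def mask_outcomes(cases, mechanism, rate, seed=1):
    """Return outcomes with None where the outcome is treated as missing."""
    rng = random.Random(seed)
    masked = []
    for is_old, died in cases:
        if mechanism == "MCAR":
            # Missingness is unrelated to any case characteristic.
            p_missing = rate
        elif mechanism == "MAR":
            # Missingness depends on an observed covariate (age group).
            p_missing = rate * (1.5 if is_old else 0.8)
        elif mechanism == "MNAR":
            # Missingness depends on the unobserved outcome itself.
            p_missing = rate * (1.5 if died else 0.5)
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        masked.append(None if rng.random() < p_missing else died)
    return masked


def complete_case_cfr(outcomes):
    """CFR among cases with a known outcome (missing outcomes dropped)."""
    observed = [d for d in outcomes if d is not None]
    return sum(observed) / len(observed)


cases = simulate_cases(20000)
true_cfr = sum(died for _, died in cases) / len(cases)
estimates = {
    m: complete_case_cfr(mask_outcomes(cases, m, rate=0.4))
    for m in ("MCAR", "MAR", "MNAR")
}

print(f"true CFR: {true_cfr:.3f}")
for mechanism, estimate in estimates.items():
    print(f"{mechanism}: complete-case CFR = {estimate:.3f}")
```

Under these assumptions the MCAR estimate should stay close to the true CFR, while the MNAR estimate should be pulled downward because deaths are hidden more often than survivals. That kind of bias cannot be corrected from the observed data alone, which is in line with the abstract's recommendation to follow up a random sample of cases with missing outcomes.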
Pages: 15
Related papers
50 in total
  • [1] Comparison of machine learning methods for estimating case fatality ratios: An Ebola outbreak simulation study (vol 16, e0257005, 2021)
    Forna, Alpha
    Dorigatti, I
    Nouvellet, P.
    Donnelly, C. A.
    PLOS ONE, 2024, 19 (12)
  • [2] HETEROGENEITIES IN THE CASE FATALITY RATE IN THE EBOLA OUTBREAK IN WEST AFRICA
    Garske, Tini
    AMERICAN JOURNAL OF TROPICAL MEDICINE AND HYGIENE, 2015, 93 (04): 1-1
  • [3] BIAS ADJUSTMENT OF CASE FATALITY RATE ESTIMATES IN THE EBOLA OUTBREAK IN WEST AFRICA
    Garske, Tini
    AMERICAN JOURNAL OF TROPICAL MEDICINE AND HYGIENE, 2017, 95 (05): 210-210
  • [4] Comparison of Machine Learning Methods for Estimating Energy Consumption in Buildings
    Mocanu, Elena
    Nguyen, Phuong H.
    Gibescu, Madeleine
    Kling, Wil L.
    2014 INTERNATIONAL CONFERENCE ON PROBABILISTIC METHODS APPLIED TO POWER SYSTEMS (PMAPS), 2014
  • [5] Estimates of Ebola Virus Case-Fatality Ratio in the 2014 West African Outbreak
    Focosi, Daniele
    Maggi, Fabrizio
    CLINICAL INFECTIOUS DISEASES, 2015, 60 (05): 829-829
  • [6] ESTIMATING CASE FATALITY RATIOS FROM INFECTIOUS DISEASE SURVEILLANCE DATA
    Reich, N. G.
    Lessler, J.
    Brookmeyer, R.
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2010, 171: S138-S138
  • [7] Heterogeneities in the case fatality ratio in the West African Ebola outbreak 2013-2016
    Garske, Tini
    Cori, Anne
    Ariyarajah, Archchun
    Blake, Isobel M.
    Dorigatti, Ilaria
    Eckmanns, Tim
    Fraser, Christophe
    Hinsley, Wes
    Jombart, Thibaut
    Mills, Harriet L.
    Nedjati-Gilani, Gemma
    Newton, Emily
    Nouvellet, Pierre
    Perkins, Devin
    Riley, Steven
    Schumacher, Dirk
    Shah, Anita
    Van Kerkhove, Maria D.
    Dye, Christopher
    Ferguson, Neil M.
    Donnelly, Christl A.
    PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2017, 372 (1721)
  • [8] A comparison of two methods for estimating prevalence ratios
    Martin R Petersen
    James A Deddens
    BMC Medical Research Methodology, 8
  • [9] Comparison of different machine learning methods for estimating compressive strength of mortars
    Caliskan, Abidin
    Demirhan, Serhat
    Tekin, Ramazan
    CONSTRUCTION AND BUILDING MATERIALS, 2022, 335
  • [10] Spatiotemporal variability in case fatality ratios for the 2013-2016 Ebola epidemic in West Africa
    Forna, Alpha
    Dorigatti, Ilaria
    Nouvellet, Pierre
    Donnelly, Christl A.
    INTERNATIONAL JOURNAL OF INFECTIOUS DISEASES, 2020, 93: 48-55