Comparison of machine learning methods for estimating case fatality ratios: An Ebola outbreak simulation study

Authors
Forna, Alpha [1]
Dorigatti, Ilaria [2]
Nouvellet, Pierre [2, 3]
Donnelly, Christl A. [2, 4]
Affiliations
[1] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC, Canada
[2] Imperial Coll London, MRC, Dept Infect Dis Epidemiol, Ctr Global Infect Dis Anal, London, England
[3] Univ Sussex, Sch Life Sci, Brighton, E Sussex, England
[4] Univ Oxford, Dept Stat, Oxford, England
Source
PLOS ONE | 2021 / Volume 16 / Issue 09
Funding
Wellcome Trust (UK); UK Medical Research Council
Keywords
MISSING DATA; DISEASE;
DOI
10.1371/journal.pone.0257005
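Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09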
Abstract
Background: Machine learning (ML) algorithms are increasingly used in infectious disease epidemiology. Epidemiologists should understand how ML algorithms behave in the context of outbreak data, where missing data are almost ubiquitous.
Methods: Using simulated data, we use an ML algorithmic framework to evaluate data imputation performance and the resulting case fatality ratio (CFR) estimates, focusing on the scale and type of data missingness: missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR).
Results: Across ML methods, dataset sizes and proportions of training data used, the area under the receiver operating characteristic curve decreased by 7% (median, range: 1%-16%) when missingness was increased from 10% to 40%. The overall reduction in CFR bias for MAR across methods, proportions of missingness, outbreak sizes and proportions of training data was 0.5% (median, range: 0%-11%).
Conclusion: ML methods could reduce bias and increase precision in CFR estimates at low levels of missingness. However, no method is robust to high percentages of missingness. Thus, a data-centric approach is recommended in outbreak settings: patient survival outcome data should be prioritised for collection, and random-sample follow-ups should be implemented to ascertain missing outcomes.
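The three missingness types named in the abstract can be illustrated with a small simulation. The sketch below is not the authors' framework: it uses no ML imputation, and the 60% true CFR, the binary "older" covariate, the missingness shifts and the function name are all hypothetical values chosen only for illustration. It compares the naive CFR computed from cases with a recorded outcome under MCAR, MAR and MNAR, and adds a covariate-stratified estimate for the MAR case.

```python
import random


def simulate_cfr_bias(n=20000, true_cfr=0.6, miss=0.4, seed=0):
    """Compare naive CFR estimates under MCAR, MAR and MNAR missingness.

    All numeric settings are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)

    # Each case: (older, died). Older cases are assumed to die more often.
    cases = []
    for _ in range(n):
        older = rng.random() < 0.5
        p_death = true_cfr + 0.2 if older else true_cfr - 0.2
        cases.append((older, rng.random() < p_death))

    def observe(p_missing):
        # Keep a case only when its outcome is not deleted.
        return [(o, d) for o, d in cases if rng.random() >= p_missing(o, d)]

    def cfr(rows):
        return sum(d for _, d in rows) / len(rows)

    def mcar(older, died):   # missingness ignores everything
        return miss

    def mar(older, died):    # missingness depends on an observed covariate
        return miss + 0.3 if older else miss - 0.3

    def mnar(older, died):   # missingness depends on the outcome itself
        return miss + 0.3 if died else miss - 0.3

    mar_rows = observe(mar)
    # Stratified estimate: CFR within each age group among observed cases,
    # weighted by that group's share of ALL cases (the covariate is fully known).
    stratified = 0.0
    for group in (True, False):
        share = sum(1 for o, _ in cases if o == group) / n
        stratified += share * cfr([r for r in mar_rows if r[0] == group])

    return {
        "full": cfr(cases),
        "MCAR": cfr(observe(mcar)),
        "MAR": cfr(mar_rows),
        "MAR_stratified": stratified,
        "MNAR": cfr(observe(mnar)),
    }


if __name__ == "__main__":
    for name, value in simulate_cfr_bias().items():
        print(f"{name:>15}: {value:.3f}")
```

Under these assumptions the MCAR estimate stays close to the full-data CFR, the naive MAR estimate is biased but can be corrected from the observed covariate, and the MNAR estimate is biased in a way that observed data alone cannot repair, which is consistent with the abstract's recommendation to follow up a random sample of cases with missing outcomes.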
Pages: 15