High-Efficiency Machine Learning Method for Identifying Foodborne Disease Outbreaks and Confounding Factors

Cited by: 12
Authors
Zhang, Peng [1 ,2 ]
Cui, Wenjuan [1 ]
Wang, Hanxue [1 ,2 ]
Du, Yi [1 ,2 ]
Zhou, Yuanchun [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Comp Network Informat Ctr, Bldg 2,Software Pk 4,South Fourth St, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing, Peoples R China
Keywords
foodborne disease outbreaks; machine learning; foodborne disease; surveillance
DOI
10.1089/fpd.2020.2913
Chinese Library Classification number
TS2 [Food Industry]
Discipline classification code
0832
Abstract
The China National Center for Food Safety Risk Assessment (CFSA) uses the Foodborne Disease Monitoring and Reporting System (FDMRS) to monitor outbreaks of foodborne diseases across the country. However, there are problems of underreporting or erroneous reporting in FDMRS, which significantly increase the cost of related epidemic investigations. To solve this problem, we designed a model to identify suspected outbreaks from the data generated by the FDMRS of CFSA. In this study, machine learning models were used to fit the data. The recall rate and F1-score were used as evaluation metrics to compare the classification performance of each model. Feature importance and pathogenic factors were identified and analyzed using tree-based and gradient boosting models. Three real foodborne disease outbreaks were then used to evaluate the best performing model. Furthermore, the SHapley Additive exPlanation value was used to identify the effect of features. Among all machine learning classification models, the eXtreme Gradient Boosting (XGBoost) model achieved the best performance, with the highest recall rate and F1-score of 0.9699 and 0.9582, respectively. In terms of model validation, the model provides a correct judgment of real outbreaks. In the feature importance analysis with the XGBoost model, the health status of the other people with the same exposure has the highest weight, reaching 0.65. The machine learning model built in this study exhibits high accuracy in recognizing foodborne disease outbreaks, thus reducing the manual burden for medical staff. The model helped us identify the confounding factors of foodborne disease outbreaks. Attention should be paid not only to the health status of those with the same exposure but also to the similarity of the cases in time and space.
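The abstract compares models by recall and F1-score. As a minimal sketch of how those two metrics are computed for a binary outbreak/non-outbreak classifier, the snippet below uses plain Python. The function name `recall_and_f1` and the toy labels are illustrative assumptions, not the study's code or data.

```python
def recall_and_f1(y_true, y_pred):
    """Return (recall, F1) for binary labels, where 1 marks a suspected outbreak."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # outbreaks correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed outbreaks

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


# Hypothetical labels for eight reported cases (not taken from the FDMRS data).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

recall, f1 = recall_and_f1(y_true, y_pred)
print(f"recall={recall:.4f}, f1={f1:.4f}")
```

Recall is the share of true outbreaks that the model flags, which is why it suits a setting where underreporting is the main concern. F1 balances recall against precision, so a model cannot score well simply by flagging every case.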
Pages: 590-598
Number of pages: 9