Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study

Cited by: 15
Authors
Wang, Hanxue [1, 2]
Cui, Wenjuan [1]
Guo, Yunchang [3]
Du, Yi [1, 2]
Zhou, Yuanchun [1, 2]
Affiliations
[1] Chinese Acad Sci, Comp Network Informat Ctr, 4 South Fourth St, Beijing 100190, Peoples R China
[2] Chinese Acad Sci Univ, Beijing, Peoples R China
[3] China Natl Ctr Food Safety Risk Assessment, Beijing, Peoples R China
Keywords
foodborne disease; pathogens prediction; machine learning; UNITED-STATES; SURVEILLANCE; NETWORK; OUTBREAKS; FOOD; ILLNESS; TRENDS;
DOI
10.2196/24924
Chinese Library Classification (CLC) number
R-058
Abstract
Background: Foodborne diseases have a high global incidence and place a heavy burden on public health and the economy. Identifying the causative pathogen is important for treating and preventing these diseases; however, diseases caused by different pathogens show few distinguishing clinical features, and in practice only a small proportion of cases undergo pathogen testing.
Objective: We aimed to analyze foodborne disease case data, select appropriate features based on the analysis results, and use machine learning methods to classify foodborne disease pathogens, so that the pathogen can be predicted for cases that have not been tested.
Methods: We extracted features such as location, time, and exposed food from foodborne disease case data and analyzed the relationship between these features and the pathogens. We then applied a variety of machine learning methods to classify the pathogens and compared the results of 4 models to obtain the pathogen prediction model with the highest accuracy.
Results: The gradient boosting decision tree model achieved the highest accuracy, approaching 69%, in identifying 4 pathogens: Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. An evaluation of feature importance showed that time of illness, geographical longitude and latitude, and diarrhea frequency play important roles in classifying the pathogens.
Conclusions: Data analysis can reveal how some features of foodborne diseases are distributed and how these features relate to one another. Classifying pathogens based on the analysis results and machine learning methods can provide useful support for clinical auxiliary diagnosis and treatment of foodborne diseases.
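The model-comparison step described in the Methods can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the case records are synthetic, the class centers are invented, and two toy classifiers (a majority-class baseline and a nearest-centroid rule) stand in for the 4 models of the study, which the abstract does not list. The helper names `encode_month`, `make_synthetic_cases`, and `compare_models` are hypothetical. The sketch only shows the general pattern of encoding time and location features, fitting several candidate models on the same training split, and ranking them by held-out accuracy.

```python
import math
import random
from collections import Counter


def encode_month(month):
    """Map month 1..12 onto a circle so December and January are neighbors."""
    angle = 2 * math.pi * (month - 1) / 12
    return math.sin(angle), math.cos(angle)


def make_synthetic_cases(n_per_class=50, seed=0):
    """Generate made-up cases. The centers are invented, not real epidemiology."""
    rng = random.Random(seed)
    # (month, longitude, latitude, diarrhea episodes per day): hypothetical values
    centers = {
        "Salmonella": (7, 113.0, 23.0, 5.0),
        "Norovirus": (1, 116.0, 40.0, 4.0),
        "Escherichia coli": (5, 104.0, 30.0, 3.0),
        "Vibrio parahaemolyticus": (9, 121.0, 31.0, 6.0),
    }
    cases = []
    for label, (month, lon, lat, freq) in centers.items():
        for _ in range(n_per_class):
            m = (month - 1 + rng.choice([-1, 0, 1])) % 12 + 1
            s, c = encode_month(m)
            features = [s, c, rng.gauss(lon, 2.0), rng.gauss(lat, 2.0), rng.gauss(freq, 1.0)]
            cases.append((features, label))
    rng.shuffle(cases)
    return cases


class MajorityClassifier:
    """Baseline: always predict the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroidClassifier:
    """Predict the label whose mean feature vector is closest (Euclidean)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids_ = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}
        return self

    def predict(self, X):
        return [
            min(self.centroids_, key=lambda lab: math.dist(row, self.centroids_[lab]))
            for row in X
        ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_models(models, cases, test_fraction=0.25):
    """Fit every model on the same training split; return held-out accuracy per model."""
    n_test = int(len(cases) * test_fraction)
    test, train = cases[:n_test], cases[n_test:]
    X_train, y_train = [f for f, _ in train], [lab for _, lab in train]
    X_test, y_test = [f for f, _ in test], [lab for _, lab in test]
    return {
        name: accuracy(y_test, model.fit(X_train, y_train).predict(X_test))
        for name, model in models.items()
    }


cases = make_synthetic_cases()
scores = compare_models(
    {"majority": MajorityClassifier(), "nearest_centroid": NearestCentroidClassifier()},
    cases,
)
best = max(scores, key=scores.get)  # model name with the highest held-out accuracy
print(scores, best)
```

The month is encoded as a sine and cosine pair so that cases from December and January are treated as close in time. In practice the toy classifiers would be replaced by real candidate models, and a single split would likely give way to cross-validation.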
Pages: 14