Lessons Learnt From Using the Machine Learning Random Forest Algorithm to Predict Virulence in Streptococcus pyogenes

Cited by: 3
Authors
Buckley, Sean J. [1 ]
Harvey, Robert J. [1 ,2 ]
Affiliations
[1] Univ Sunshine Coast, Sch Hlth & Behav Sci, Maroochydore, Qld, Australia
[2] Sunshine Coast Hlth Inst, Birtinya, Qld, Australia
Keywords
Streptococcus pyogenes; machine learning; random forest; virulence; phenotype metadata; PLASMINOGEN;
DOI
10.3389/fcimb.2021.809560
Chinese Library Classification number
R392 [Medical Immunology]; Q939.91 [Immunology];
Discipline classification code
100102;
Abstract
Group A Streptococcus (GAS) is a globally significant human pathogen. The extensive variability of the GAS genome, virulence phenotypes and clinical outcomes makes it an excellent candidate for genotype-phenotype association studies in the era of whole-genome sequencing. We have catalogued the distribution and diversity of the transcription regulators of GAS, and employed phylogenetics, concordance metrics and machine learning (ML) to test for associations. In this review, we communicate the lessons learnt in the context of recent bacterial genotype-phenotype association studies by others that have utilised both genome-wide association studies (GWAS) and ML. We envisage a promising future for the application of GWAS in bacterial genotype-phenotype association studies and foresee the increasing use of ML. However, progress in this field is hindered by several outstanding bottlenecks. These include the shortcomings observed when GWAS techniques that have been fine-tuned on human genomes are applied to bacterial genomes. Furthermore, there is a deficit of easy-to-use end-to-end workflows, and a lag in the collection of detailed phenotype and clinical genomic metadata. We propose a novel quality control protocol for the collection of high-quality GAS virulence phenotype data coupled to clinical outcome data. Finally, we incorporate this protocol into a workflow for testing genotype-phenotype associations using ML and 'linked' patient-microbe genome sets that better represent the infection event.
Pages: 7
Related papers
50 in total
  • [41] Application of the Random Forest Machine Learning Algorithm for Recognizing Patient Arm Movements While Using a Bionic Prosthesis
    Burtsev, N. I.
    Shagdurov, V. C.
    Demkin, I. O.
    XIV RUSSIAN-GERMANY CONFERENCE ON BIOMEDICAL ENGINEERING (RGC-2019), 2019, 2140
  • [42] Machine learning in health condition check-up: An approach using Breiman's random forest algorithm
    Abd Algani Y.M.
    Ritonga M.
    Kiran Bala B.
    Al Ansari M.S.
    Badr M.
    Taloba A.I.
    Measurement: Sensors, 2022, 23
  • [43] Construction of a random survival forest model based on a machine learning algorithm to predict early recurrence after hepatectomy for adult hepatocellular carcinoma
    Ji Zhang
    Qing Chen
    Yu Zhang
    Jie Zhou
    BMC Cancer, 24 (1)
  • [44] Machine learning algorithm for Avocado image segmentation based on quantum enhancement and Random forest
    El Amraoui, Khalid
    Ezzaki, Ayoub
    Masmoudi, Lhoussaine
    Hadri, Majid
    El Belrhiti, Hicham
    El Ansari, Mohamed
    Amari, Aziz
    2022 2ND INTERNATIONAL CONFERENCE ON INNOVATIVE RESEARCH IN APPLIED SCIENCE, ENGINEERING AND TECHNOLOGY (IRASET'2022), 2022, : 149 - 155
  • [45] Using the rotation and random forest models of ensemble learning to predict landslide susceptibility
    Zhao, Lingran
    Wu, Xueling
    Niu, Ruiqing
    Wang, Ying
    Zhang, Kaixiang
    GEOMATICS NATURAL HAZARDS & RISK, 2020, 11 (01) : 1542 - 1564
  • [46] Improving Deep Learning Performance Using Random Forest HTM Cortical Learning Algorithm
    Abbas, Mohamed AbdElhamid
    PROCEEDINGS OF 2018 FIRST INTERNATIONAL WORKSHOP ON DEEP AND REPRESENTATION LEARNING (IWDRL), 2018, : 13 - 18
  • [47] Using machine learning to predict mortality in older patients with cancer: Decision tree and random forest analyses from the ELCAPA and ONCODAGE prospective cohorts.
    Audureau, Etienne
    Soubeyran, Pierre-Louis
    Martinez-Tapia, Claudia
    Bellera, Carine A.
    Bastuji-Garin, Sylvie
    Boudou-Rouquette, Pascaline
    Rainfray, Muriel
    Chahwakilian, Anne
    Grellety, Thomas
    Hanon, Olivier
    Mathoulin-Pelissier, Simone
    Paillaud, Elena
    Canoui-Poitrine, Florence
    JOURNAL OF CLINICAL ONCOLOGY, 2019, 37 (15)
  • [48] Using Random Forest Algorithm to Predict the Hydraulic Conductivity of Compacted Soil Liners/Covers
    Zhang, Poyu
    Tan, Yu
    Chen, Jiannan
    Nam, Boo Hyun
    GEO-CONGRESS 2023: SOIL IMPROVEMENT, GEOENVIRONMENTAL, AND SUSTAINABILITY, 2023, 339 : 193 - 200
  • [49] Mapping of the Maize Area Using Remotely Detected Multispectral and Radar Images Based on a Random Forest Machine Learning Algorithm
    Jombo, Simbarashe
    Abd Elbasit, Mohamed
    2024 IST-AFRICA CONFERENCE, 2024,
  • [50] Prediction of the Minimum Film Boiling Temperature of Quenching Vertical Rods in Water Using Random Forest Machine Learning Algorithm
    Alotaibi, Sorour
    Ebrahim, Shikha
    Salman, Ayed
    FRONTIERS IN ENERGY RESEARCH, 2021, 9 (09):