Predicting antimicrobial resistance using conserved genes

Cited by: 27
Authors
Nguyen, Marcus [1, 2]
Olson, Robert [1, 2]
Shukla, Maulik [1, 2]
VanOeffelen, Margo [3]
Davis, James J. [1, 2, 3, 4]
Affiliations
[1] Argonne Natl Lab, Div Data Sci & Learning, 9700 S Cass Ave, Argonne, IL 60439 USA
[2] Univ Chicago, Consortium Adv Sci & Engn, Chicago, IL 60637 USA
[3] Fellowship Interpretat Genomes, Burr Ridge, IL 60527 USA
[4] Northwestern Argonne Inst Sci & Engn, Evanston, IL 60208 USA
Keywords
METHICILLIN-RESISTANCE; ESCHERICHIA-COLI; IDENTIFICATION; PERFORMANCE; ADAPTATION; MUTATIONS; SURVIVAL; BIOFILM; PROTEIN; SINGLE;
DOI
10.1371/journal.pcbi.1008319
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Author summary: Machine learning models for predicting AMR phenotypes from sequence data are often built using features derived from well-studied sets of AMR genes, or from whole genome sequences. In this study, we build models using core genes that are held in common among the members of a species and that are not known to confer antimicrobial resistance based on their annotations. We find that there is sufficient variation in these core conserved genes to produce models with accuracies greater than or equal to 80% in four species, using as few as 100 genes. However, we note that these models are less accurate than models built from whole genomes or lists of AMR genes. The results of this study suggest that variations relating to, or co-occurring with, AMR are extensive, and that it is possible to use conserved non-AMR genes to predict AMR phenotypes.

A growing number of studies are using machine learning models to accurately predict antimicrobial resistance (AMR) phenotypes from bacterial sequence data. Although these studies are showing promise, the models are typically trained using features derived from comprehensive sets of AMR genes or whole genome sequences and may not be suitable for use when genomes are incomplete. In this study, we explore the possibility of predicting AMR phenotypes using incomplete genome sequence data. Models were built from small sets of randomly selected core genes after removing the AMR genes. For Klebsiella pneumoniae, Mycobacterium tuberculosis, Salmonella enterica, and Staphylococcus aureus, we report that it is possible to classify susceptible and resistant phenotypes with average F1 scores ranging from 0.80-0.89 with as few as 100 conserved non-AMR genes, with very major error rates ranging from 0.11-0.23 and major error rates ranging from 0.10-0.20. Models built from core genes have predictive power in cases where the primary AMR mechanisms result from SNPs or horizontal gene transfer. By randomly sampling non-overlapping sets of core genes, we show that F1 scores and error rates are stable and have little variance between replicates. Although these small core gene models have lower accuracies and higher error rates than models built from the corresponding assembled genomes, the results suggest that sufficient variation exists in the core non-AMR genes of a species for predicting AMR phenotypes.
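The abstract reports F1 scores, very major error (VME) rates, and major error (ME) rates over non-overlapping random sets of core genes. The following is a minimal sketch of how such quantities might be computed, not the authors' pipeline. It assumes the conventional definitions (VME: resistant isolates predicted susceptible, as a fraction of all resistant isolates; ME: susceptible isolates predicted resistant, as a fraction of all susceptible isolates) and computes F1 for the resistant class only, whereas the paper's averaging scheme may differ. The function names `split_non_overlapping` and `error_metrics` and the "R"/"S" labels are hypothetical.

```python
import random


def split_non_overlapping(gene_ids, set_size, n_sets, seed=0):
    """Shuffle gene IDs once, then slice them into disjoint sets.

    Hypothetical helper: illustrates sampling without overlap between
    replicates; it is not taken from the paper.
    """
    if set_size * n_sets > len(gene_ids):
        raise ValueError("not enough genes for the requested disjoint sets")
    shuffled = list(gene_ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i * set_size:(i + 1) * set_size] for i in range(n_sets)]


def error_metrics(y_true, y_pred, resistant="R", susceptible="S"):
    """Return F1 (resistant class), VME rate, and ME rate.

    Assumed definitions:
      VME = resistant isolates predicted susceptible / all resistant isolates
      ME  = susceptible isolates predicted resistant / all susceptible isolates
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == resistant and p == resistant for t, p in pairs)
    fn = sum(t == resistant and p == susceptible for t, p in pairs)
    fp = sum(t == susceptible and p == resistant for t, p in pairs)
    tn = sum(t == susceptible and p == susceptible for t, p in pairs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    vme = fn / (tp + fn) if (tp + fn) else 0.0
    me = fp / (fp + tn) if (fp + tn) else 0.0
    return {"f1": f1, "vme": vme, "me": me}


if __name__ == "__main__":
    # Placeholder gene identifiers; real core-gene IDs would go here.
    genes = [f"gene_{i}" for i in range(1000)]
    replicates = split_non_overlapping(genes, set_size=100, n_sets=5)
    print(len(replicates), "disjoint sets of", len(replicates[0]), "genes")

    # Toy labels only, to show the metric calculation.
    truth = ["R", "R", "R", "R", "S", "S", "S", "S"]
    preds = ["R", "R", "R", "S", "S", "S", "S", "R"]
    print(error_metrics(truth, preds))
```

In a workflow of this shape, each disjoint gene set would feed one model, and the spread of the resulting metrics across sets would indicate how stable the predictions are between replicates.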
Pages: 24
Related papers
50 in total
  • [1] Antimicrobial resistance bacteria and genes detected in hospital sewage provide valuable information in predicting clinical antimicrobial resistance
    Cai, Leshan
    Sun, Jiayu
    Yao, Fen
    Yuan, Yumeng
    Zeng, Mi
    Zhang, Qiaoxin
    Xie, Qingdong
    Wang, Shiwei
    Wang, Zhen
    Jiao, Xiaoyang
    SCIENCE OF THE TOTAL ENVIRONMENT, 2021, 795
  • [2] Predicting effects of changed antimicrobial usage on the abundance of antimicrobial resistance genes in finishers' gut microbiomes
    Andersen, V. D.
    Aarestrup, F. M.
    Munk, P.
    Jensen, M. S.
    de Knegt, L., V
    Bortolaia, V
    Knudsen, B. E.
    Lukjancenko, O.
    Birkegard, A. C.
    Vigre, H.
    PREVENTIVE VETERINARY MEDICINE, 2020, 174
  • [3] Predicting Antimicrobial Resistance Using Partial Genome Alignments
    Aytan-Aktug, D.
    Nguyen, M.
    Clausen, P. T. L. C.
    Stevens, R. L.
    Aarestrup, F. M.
    Lund, O.
    Davis, J. J.
    MSYSTEMS, 2021, 6 (03)
  • [4] Predicting variable gene content in Escherichia coli using conserved genes
    Nguyen, Marcus
    Elmore, Zachary
    Ihle, Clay
    Moen, Francesco S.
    Slater, Adam D.
    Turner, Benjamin N.
    Parrello, Bruce
    Best, Aaron A.
    Davis, James J.
    MSYSTEMS, 2023, 8 (04)
  • [5] Predicting future hospital antimicrobial resistance prevalence using machine learning
    Vihta, Karina-Doris
    Pritchard, Emma
    Pouwels, Koen B.
    Hopkins, Susan
    Guy, Rebecca L.
    Henderson, Katherine
    Chudasama, Dimple
    Hope, Russell
    Muller-Pebody, Berit
    Walker, Ann Sarah
    Clifton, David
    Eyre, David W.
    COMMUNICATIONS MEDICINE, 2024, 4 (01)
  • [6] Predicting Relative Risk of Antimicrobial Resistance using Machine Learning Methods
    Wu, Ying
    Jiang, Peng
    Goh, Shin Giek
    Yu, Kaifeng
    Chen, Yihan
    He, Yiliang
    Gin, Karina Y. H.
    IFAC PAPERSONLINE, 2022, 55 (10) : 1266 - 1271
  • [7] Predicting antimicrobial resistance of bacterial pathogens using time series analysis
    Kim, Jeonghoon
    Rupasinghe, Ruwini
    Halev, Avishai
    Huang, Chao
    Rezaei, Shahbaz
    Clavijo, Maria J.
    Robbins, Rebecca C.
    Martinez-Lopez, Beatriz
    Liu, Xin
    FRONTIERS IN MICROBIOLOGY, 2023, 14
  • [8] Predicting the emergence of antimicrobial resistance - Reply
    Doern, GV
    CLINICAL INFECTIOUS DISEASES, 2002, 34 (10) : 1418 - 1420
  • [9] Predicting antimicrobial resistance of Lactococcus garvieae: PCR detection of resistance genes versus MALDI-TOF protein profiling
    Torres-Corral, Yolanda
    Santos, Ysabel
    AQUACULTURE, 2022, 553
  • [10] Mobile Antimicrobial Resistance Genes in Probiotics
    Toth, Adrienn Greta
    Csabai, Istvan
    Judge, Maura Fiona
    Maroti, Gergely
    Becsei, Agnes
    Spisak, Sandor
    Solymosi, Norbert
    ANTIBIOTICS-BASEL, 2021, 10 (11)