Scalable de novo classification of antibiotic resistance of Mycobacterium tuberculosis

Cited by: 0
Authors
Serajian, Mohammadali [1]
Marini, Simone [2]
Alanko, Jarno N. [3]
Noyes, Noelle R. [4]
Prosperi, Mattia [2]
Boucher, Christina [1]
Affiliations
[1] Univ Florida, Dept Comp & Informat Sci & Engn, 1889 Museum Rd, Gainesville, FL 32611 USA
[2] Univ Florida, Dept Epidemiol, POB 100231, Gainesville, FL 32601 USA
[3] Univ Helsinki, Dept Comp Sci, POB 4, Helsinki 00014, Finland
[4] Univ Minnesota, Dept Vet Populat Med, 1365 Gortner Ave, St Paul, MN 55108 USA
Keywords
READ ALIGNMENT; GENOME; TOOL;
DOI
10.1093/bioinformatics/btae243
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: The World Health Organization estimates that there were over 10 million cases of tuberculosis (TB) worldwide in 2019, resulting in over 1.4 million deaths, with a worrying upward trend from year to year. The disease is caused by Mycobacterium tuberculosis (MTB) and spreads through airborne transmission. Treatment of TB is estimated to be 85% successful; however, this drops to 57% if MTB exhibits multiple antimicrobial resistance (AMR), for which fewer treatment options are available.
Results: We develop a robust machine-learning classifier using both linear and nonlinear models (i.e. LASSO logistic regression (LR) and random forests (RF)) to predict the phenotypic resistance of MTB to a broad range of antibiotic drugs. We train our classifier on data from the CRyPTIC consortium, which consist of whole-genome sequencing and antibiotic susceptibility testing (AST) phenotypic data for 13 different antibiotics. To train our model, we assemble the sequence data into genomic contigs, identify all unique 31-mers in the set of contigs, and build a feature matrix M, where M[i, j] is equal to the number of times the ith 31-mer occurs in the jth genome. Because of the size of this feature matrix (over 350 million unique 31-mers), we build and use a sparse matrix representation. Our method, which we refer to as MTB++, leverages compact data structures and iterative methods to allow all of the 31-mers to be screened in the development of both LASSO LR and RF. MTB++ achieves high discrimination (F-1 >80%) for the first-line antibiotics. Moreover, MTB++ had the highest F-1 score in all but three classes and was the most comprehensive, since it had an F-1 score >75% for all but four (rare) antibiotic drugs. We use our feature selection to contextualize the 31-mers that are used for the prediction of phenotypic resistance, leading to some insights about sequence similarity to genes in MEGARes. Lastly, we give an estimate of the amount of data that is needed to provide accurate predictions.
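The feature matrix described in the abstract can be illustrated with a minimal sketch. It is not the MTB++ implementation: the function names, the toy sequences, and the dictionary-of-coordinates storage are assumptions made for illustration, the compact data structures and iterative methods mentioned in the abstract are not reproduced, and reverse complements are ignored. It only shows the definition of M[i, j] as the count of the ith k-mer in the jth genome, storing nonzero entries only.

```python
from collections import Counter


def count_kmers(contigs, k=31):
    """Count every k-mer (length-k substring) across one genome's contigs."""
    counts = Counter()
    for contig in contigs:
        # Contigs shorter than k yield an empty range and contribute nothing.
        for start in range(len(contig) - k + 1):
            counts[contig[start:start + k]] += 1
    return counts


def build_sparse_matrix(genomes, k=31):
    """Return (kmer_index, entries) where entries[(i, j)] == M[i, j].

    Only nonzero counts are stored, so memory grows with the number of
    (k-mer, genome) pairs that actually occur, not with rows * columns.
    """
    kmer_index = {}  # k-mer string -> row number i
    entries = {}     # (i, j) -> count of k-mer i in genome j
    for j, contigs in enumerate(genomes):
        for kmer, count in count_kmers(contigs, k).items():
            i = kmer_index.setdefault(kmer, len(kmer_index))
            entries[(i, j)] = count
    return kmer_index, entries


# Hypothetical toy data: two "genomes", each a list of contigs, with k=3
# instead of 31 so the result is easy to inspect by hand.
toy_genomes = [["ACGTACG"], ["ACGT", "TTACG"]]
kmer_index, M = build_sparse_matrix(toy_genomes, k=3)

# Look up M[i, j] with a default of 0 for entries that are not stored.
acg_row = kmer_index["ACG"]
print(M.get((acg_row, 0), 0), M.get((acg_row, 1), 0))
```

At the scale the abstract reports (over 350 million unique 31-mers), a plain dictionary would likely be too memory-hungry; the same (row, column, count) triplets are what a compressed sparse matrix format would store more compactly.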
Pages: i39-i47
Number of pages: 9