SeroBA: rapid high-throughput serotyping of Streptococcus pneumoniae from whole genome sequence data

Cited by: 84
Authors
Epping, Lennard [1, 2]
van Tonder, Andries J. [3 ]
Gladstone, Rebecca A. [3 ]
Bentley, Stephen D. [3 ]
Page, Andrew J. [1, 4]
Keane, Jacqueline A. [1 ]
Affiliations
[1] Wellcome Sanger Inst, Pathogen Informat, Hinxton CB10 1SA, Cambs, England
[2] Robert Koch Inst, Microbial Genom, Berlin, Germany
[3] Wellcome Sanger Inst, Infect Genom, Hinxton CB10 1SA, Cambs, England
[4] Norwich Res Pk, Quadram Inst, Norwich, Norfolk, England
Source
MICROBIAL GENOMICS | 2018 / Volume 4 / Issue 07
Funding
Wellcome Trust (UK);
Keywords
Streptococcus pneumoniae; serotyping; pneumococcal; whole genome sequencing; k-mer method; PNEUMOCOCCAL DISEASE; VACCINATION; DISCOVERY; CHILDREN; LOCUS; PCR;
DOI
10.1099/mgen.0.000186
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Streptococcus pneumoniae is responsible for 240 000-460 000 deaths in children under 5 years of age each year. Accurate identification of pneumococcal serotypes is important for tracking the distribution and evolution of serotypes following the introduction of effective vaccines. Recent efforts have been made to infer serotypes directly from genomic data but current software approaches are limited and do not scale well. Here, we introduce a novel method, SeroBA, which uses a k-mer approach. We compare SeroBA against real and simulated data and present results on the concordance and computational performance against a validation dataset, the robustness and scalability when analysing a large dataset, and the impact of varying the depth of coverage on sequence-based serotyping. SeroBA can predict serotypes, by identifying the cps locus, directly from raw whole genome sequencing read data with 98 % concordance using a k-mer-based method, can process 10 000 samples in just over 1 day using a standard server and can call serotypes at a coverage as low as 15-21x. SeroBA is implemented in Python3 and is freely available under an open source GPLv3 licence from: https://github.com/sangerpathogens/seroba
Pages: 6