Genometa - A Fast and Accurate Classifier for Short Metagenomic Shotgun Reads

Cited by: 26
Authors
Davenport, Colin F. [1 ]
Neugebauer, Jens [1 ]
Beckmann, Nils [2 ]
Friedrich, Benedikt [2 ]
Kameri, Burim [2 ]
Kokott, Svea [1 ]
Paetow, Malte [2 ]
Siekmann, Bjoern [2 ]
Wieding-Drewes, Matthias [2 ]
Wienhoefer, Markus [2 ]
Wolf, Stefan [2 ]
Tuemmler, Burkhard [1 ]
Ahlers, Volker [2 ]
Sprengel, Frauke [2 ]
Affiliations
[1] Hannover Med Sch, D-3000 Hannover, Lower Saxony, Germany
[2] Univ Appl Sci & Arts, Dept Comp Sci, Hannover, Lower Saxony, Germany
Source
PLOS ONE | 2012 / Vol. 7 / Issue 08
Funding
National Institutes of Health (USA);
Keywords
RIBOSOMAL-RNA; SEQUENCES; MICROBIOME; COMMUNITIES; ALIGNMENT; BACTERIA; TAXONOMY; ARCHAEA; SERVER; GENES;
DOI
10.1371/journal.pone.0041224
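Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];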
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer.
Pages: 8
Related papers
50 in total
  • [41] SpoTyping: fast and accurate in silico Mycobacterium spoligotyping from sequence reads
    Eryu Xia
    Yik-Ying Teo
    Rick Twee-Hee Ong
    Genome Medicine, 8
  • [42] Higher Classification Accuracy of Short Metagenomic Reads by Discriminative Spaced k-mers
    Ounit, Rachid
    Lonardi, Stefano
    ALGORITHMS IN BIOINFORMATICS (WABI 2015), 2015, 9289 : 286 - 295
  • [43] SpoTyping: fast and accurate in silico Mycobacterium spoligotyping from sequence reads
    Xia, Eryu
    Teo, Yik-Ying
    Ong, Rick Twee-Hee
    GENOME MEDICINE, 2016, 8
  • [44] Fast and accurate identification of semi-tryptic peptides in shotgun proteomics
    Alves, Pedro
    Arnold, Randy J.
    Clemmer, David E.
    Li, Yixue
    Reilly, James P.
    Sheng, Quanhu
    Tang, Haixu
    Xun, Zhiyin
    Zeng, Rong
    Radivojac, Predrag
    BIOINFORMATICS, 2008, 24 (01) : 102 - 109
  • [45] Fast and Accurate Taxonomic Assignments of Metagenomic Sequences Using MetaBin
    Sharma, Vineet K.
    Kumar, Naveen
    Prakash, Tulika
    Taylor, Todd D.
    PLOS ONE, 2012, 7 (04):
  • [46] Trowel: a fast and accurate error correction module for Illumina sequencing reads
    Lim, Eun-Cheon
    Mueller, Jonas
    Hagmann, Joerg
    Henz, Stefan R.
    Kim, Sang-Tae
    Weigel, Detlef
    BIOINFORMATICS, 2014, 30 (22) : 3264 - 3265
  • [47] The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads
    Wang, Zhiwen
    Hobson, Neil
    Galindo, Leonardo
    Zhu, Shilin
    Shi, Daihu
    McDill, Joshua
    Yang, Linfeng
    Hawkins, Simon
    Neutelings, Godfrey
    Datla, Raju
    Lambert, Georgina
    Galbraith, David W.
    Grassa, Christopher J.
    Geraldes, Armando
    Cronk, Quentin C.
    Cullis, Christopher
    Dash, Prasanta K.
    Kumar, Polumetla A.
    Cloutier, Sylvie
    Sharpe, Andrew G.
    Wong, Gane K. -S.
    Wang, Jun
    Deyholos, Michael K.
    PLANT JOURNAL, 2012, 72 (03) : 461 - 473
  • [48] Fast and Accurate Classification of Meta-Genomics Long Reads With deSAMBA
    Li, Gaoyang
    Liu, Yongzhuang
    Li, Deying
    Liu, Bo
    Li, Junyi
    Hu, Yang
    Wang, Yadong
    FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY, 2021, 9
  • [49] 16S Classifier: A Tool for Fast and Accurate Taxonomic Classification of 16S rRNA Hypervariable Regions in Metagenomic Datasets
    Chaudhary, Nikhil
    Sharma, Ashok K.
    Agarwal, Piyush
    Gupta, Ankit
    Sharma, Vineet K.
    PLOS ONE, 2015, 10 (02):
  • [50] MTR: taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks
    Gori, Fabio
    Folino, Gianluigi
    Jetten, Mike S. M.
    Marchiori, Elena
    BIOINFORMATICS, 2011, 27 (02) : 196 - 203