Genometa - A Fast and Accurate Classifier for Short Metagenomic Shotgun Reads

Cited by: 26
Authors
Davenport, Colin F. [1 ]
Neugebauer, Jens [1 ]
Beckmann, Nils [2 ]
Friedrich, Benedikt [2 ]
Kameri, Burim [2 ]
Kokott, Svea [1 ]
Paetow, Malte [2 ]
Siekmann, Bjoern [2 ]
Wieding-Drewes, Matthias [2 ]
Wienhoefer, Markus [2 ]
Wolf, Stefan [2 ]
Tuemmler, Burkhard [1 ]
Ahlers, Volker [2 ]
Sprengel, Frauke [2 ]
Affiliations
[1] Hannover Med Sch, D-3000 Hannover, Lower Saxony, Germany
[2] Univ Appl Sci & Arts, Dept Comp Sci, Hannover, Lower Saxony, Germany
Source
PLOS ONE | 2012 / Vol. 7 / Issue 08
Funding
National Institutes of Health (USA);
Keywords
RIBOSOMAL-RNA; SEQUENCES; MICROBIOME; COMMUNITIES; ALIGNMENT; BACTERIA; TAXONOMY; ARCHAEA; SERVER; GENES;
DOI
10.1371/journal.pone.0041224
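Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code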
07 ; 0710 ; 09 ;
Abstract
Metagenomic studies use high-throughput sequence data to investigate microbial communities in situ. However, considerable challenges remain in the analysis of these data, particularly with regard to speed and reliable analysis of microbial species as opposed to higher level taxa such as phyla. We here present Genometa, a computationally undemanding graphical user interface program that enables identification of bacterial species and gene content from datasets generated by inexpensive high-throughput short read sequencing technologies. Our approach was first verified on two simulated metagenomic short read datasets, detecting 100% and 94% of the bacterial species included with few false positives or false negatives. Subsequent comparative benchmarking analysis against three popular metagenomic algorithms on an Illumina human gut dataset revealed Genometa to attribute the most reads to bacteria at species level (i.e. including all strains of that species) and demonstrate similar or better accuracy than the other programs. Lastly, speed was demonstrated to be many times that of BLAST due to the use of modern short read aligners. Our method is highly accurate if bacteria in the sample are represented by genomes in the reference sequence but cannot find species absent from the reference. This method is one of the most user-friendly and resource efficient approaches and is thus feasible for rapidly analysing millions of short reads on a personal computer.
Pages: 8
Related papers
50 in total
  • [31] MGS-Fast: Metagenomic shotgun data fast annotation using microbial gene catalogs
    Brown, Stuart M.
    Chen, Hao
    Hao, Yuhan
    Laungani, Bobby P.
    Ali, Thahmina A.
    Dong, Changsu
    Lijeron, Carlos
    Kim, Baekdoo
    Wultsch, Claudia
    Pei, Zhiheng
    Krampis, Konstantinos
    GIGASCIENCE, 2019, 8 (04):
  • [32] ASElux: an ultra-fast and accurate allelic reads counter
    Miao, Zong
    Alvarez, Marcus
    Pajukanta, Paeivi
    Ko, Arthur
    BIOINFORMATICS, 2018, 34 (08) : 1313 - 1320
  • [33] Bartender: a fast and accurate clustering algorithm to count barcode reads
    Zhao, Lu
    Liu, Zhimin
    Levy, Sasha F.
    Wu, Song
    BIOINFORMATICS, 2018, 34 (05) : 739 - 747
  • [34] PlasGUN: gene prediction in plasmid metagenomic short reads using deep learning
    Fang, Zhencheng
    Tan, Jie
    Wu, Shufang
    Li, Mo
    Wang, Chunhui
    Liu, Yongchu
    Zhu, Huaiqiu
    BIOINFORMATICS, 2020, 36 (10) : 3239 - 3241
  • [35] Fast and accurate matching of cellular barcodes across short-reads and long-reads of single-cell RNA-seq experiments
    Ebrahimi, Ghazal
    Orabi, Baraa
    Robinson, Meghan
    Chauve, Cedric
    Flannigan, Ryan
    Hach, Faraz
    ISCIENCE, 2022, 25 (07)
  • [36] Statistical correction for functional metagenomic profiling of a microbial community with short NGS reads
    Du, Ruofei
    Fang, Zhide
    JOURNAL OF APPLIED STATISTICS, 2018, 45 (14) : 2521 - 2535
  • [37] Algorithms and strategies in short-read shotgun metagenomic reconstruction of plant communities
    Harbert, Robert S.
    APPLICATIONS IN PLANT SCIENCES, 2018, 6 (03):
  • [38] Short pyrosequencing reads suffice for accurate microbial community analysis
    Liu, Zongzhi
    Lozupone, Catherine
    Hamady, Micah
    Bushman, Frederic D.
    Knight, Rob
    NUCLEIC ACIDS RESEARCH, 2007, 35 (18)
  • [39] AccuRA: Accurate Alignment of Short Reads on Scalable Reconfigurable Accelerators
    Natarajan, Santhi
    Kumar, Krishna N.
    Pal, Debnath
    Nandy, S. K.
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING AND SIMULATION (SAMOS), 2016, : 79 - 87
  • [40] SHRiMP: Accurate Mapping of Short Color-space Reads
    Rumble, Stephen M.
    Lacroute, Phil
    Dalca, Adrian V.
    Fiume, Marc
    Sidow, Arend
    Brudno, Michael
    PLOS COMPUTATIONAL BIOLOGY, 2009, 5 (05)