cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs

Cited by: 18
Authors
Tolstoganov, Ivan [1]
Bankevich, Anton [2]
Chen, Zhoutao [3]
Pevzner, Pavel A. [1,2]
Affiliations
[1] St Petersburg State Univ, Inst Translat Biomed, Ctr Algorithm Biotechnol, St Petersburg, Russia
[2] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[3] Universal Sequencing Technol Corp, Carlsbad, CA USA
Funding
Russian Science Foundation
Keywords
DNA EXTRACTION; GENOME; ACCURATE;
DOI
10.1093/bioinformatics/btz349
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers are optimized for a narrow range of parameters and are not easily extendable to new barcoding technologies and new applications such as metagenomics or hybrid assembly.
Results: We describe the algorithmic challenge of the SLR assembly and present a cloudSPAdes algorithm for SLR assembly that is based on analyzing the de Bruijn graph of SLRs. We benchmarked cloudSPAdes across various barcoding technologies/applications and demonstrated that it improves on the state-of-the-art SLR assemblers in accuracy and speed.
Availability and implementation: Source code and installation manual for cloudSPAdes are available at https://github.com/ablab/spades/releases/tag/cloudspades-paper.
Supplementary information: Supplementary data are available at Bioinformatics online.
Pages: I61-I70
Page count: 10