Parallel De Bruijn Graph Construction and Traversal for De Novo Genome Assembly

Cited by: 46
Authors
Georganas, Evangelos [1, 3]
Buluc, Aydin [1]
Chapman, Jarrod [2]
Oliker, Leonid [1]
Rokhsar, Daniel [2, 4]
Yelick, Katherine [1, 3]
Affiliations
[1] Lawrence Berkeley Natl Lab, Computat Res Div, Berkeley, CA 94720 USA
[2] Lawrence Berkeley Natl Lab, Joint Genome Inst, Berkeley, CA USA
[3] Univ Calif Berkeley, Dept EECS, Berkeley, CA 94720 USA
[4] Univ Calif Berkeley, Mol & Cell Biol Dept, Berkeley, CA 94720 USA
Keywords
k-mers
DOI
10.1109/SC.2014.41
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous fragments called reads. We study optimized parallelization of the most time-consuming phases of Meraculous, a state-of-the-art production assembler. First, we present a new parallel algorithm for k-mer analysis, characterized by intensive communication and I/O requirements, and reduce the memory requirements by 6.93x. Second, we efficiently parallelize de Bruijn graph construction and traversal, which necessitates a distributed hash table and is a key component of most de novo assemblers. We provide a novel algorithm that leverages the one-sided communication capabilities of Unified Parallel C (UPC) to facilitate the requisite fine-grained parallelism and avoidance of data hazards, while analytically proving its scalability properties. Overall results show unprecedented performance and efficient scaling on up to 15,360 cores of a Cray XC30, on the human genome as well as the challenging wheat genome, with performance improvement from days to seconds.
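For orientation, the three phases named in the abstract (k-mer analysis, de Bruijn graph construction in a hash table, and traversal into contigs) can be sketched on a single node. This is a minimal illustration, not the Meraculous or UPC implementation: the function names `count_kmers`, `build_graph`, and `traverse`, the `min_count` threshold, and the toy reads are all assumptions made for the example, and it ignores reverse complements, quality scores, and distribution across processors.

```python
from collections import Counter


def count_kmers(reads, k):
    """k-mer analysis: count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_graph(counts, min_count=2):
    """Graph construction: hash table mapping each retained k-mer to its
    (left extensions, right extensions). K-mers seen fewer than min_count
    times are treated as likely sequencing errors and dropped."""
    kmers = {km for km, c in counts.items() if c >= min_count}
    graph = {}
    for km in kmers:
        left = [b for b in "ACGT" if b + km[:-1] in kmers]
        right = [b for b in "ACGT" if km[1:] + b in kmers]
        graph[km] = (left, right)
    return graph


def traverse(graph):
    """Traversal: grow contigs from seed k-mers along unambiguous edges."""
    visited = set()
    contigs = []
    for seed in sorted(graph):
        if seed in visited:
            continue
        visited.add(seed)
        contig = seed
        # Extend to the right while the path is unique in both directions.
        cur = seed
        while len(graph[cur][1]) == 1:
            nxt = cur[1:] + graph[cur][1][0]
            if nxt in visited or len(graph[nxt][0]) != 1:
                break
            visited.add(nxt)
            contig += nxt[-1]
            cur = nxt
        # Extend to the left under the same uniqueness condition.
        cur = seed
        while len(graph[cur][0]) == 1:
            prv = graph[cur][0][0] + cur[:-1]
            if prv in visited or len(graph[prv][1]) != 1:
                break
            visited.add(prv)
            contig = prv[0] + contig
            cur = prv
        contigs.append(contig)
    return contigs


# Toy input: two clean reads plus one read with an error near its end.
reads = ["ATGGCGTGCA", "ATGGCGTGCA", "ATGGCTT"]
graph = build_graph(count_kmers(reads, k=4), min_count=2)
contigs = traverse(graph)
```

In the parallel setting the abstract describes, the dictionary would be replaced by a distributed hash table, and concurrent traversals starting from different seeds would need synchronization to avoid the data hazards the authors mention.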
Pages: 437-448
Page count: 12