Spaler: Spark and GraphX based de novo genome assembler

Cited by: 0
Authors
Abu-Doleh, Anas [1]
Catalyurek, Umit V. [2]
Affiliations
[1] Ohio State Univ, Dept Elect & Comp Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Dept Elect & Comp Engn, Dept Biomed Informat, Columbus, OH 43210 USA
Keywords
distributed sequence assembly; de novo assembly; de Bruijn graph; Spark
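DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812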
Abstract
Recent advances in high-throughput genome sequencing technologies have accelerated the discovery of novel genomes. De novo assembly is the first, and one of the most computationally intensive, steps in analyzing such novel genomes. In this work, we addressed the problem of parallelizing de Bruijn graph based de novo genome sequence assembly on distributed memory systems. We proposed a new tool, Spaler, which assembles short reads efficiently and accurately. Spaler is based on the Spark framework and the GraphX API. We compared the performance of Spaler to other distributed memory based assemblers, in particular ABySS, Ray, and SWAP-Assembler. The results show that Spaler scales better than existing tools and produces comparable or better results in terms of solution quality.
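For readers unfamiliar with the de Bruijn graph approach named in the abstract, the following is a minimal single-machine sketch in plain Python. It is not Spaler's implementation and uses neither Spark nor GraphX; the function names `build_de_bruijn` and `walk_unambiguous`, the choice of k, and the toy reads are all assumptions made for illustration. Each k-mer in a read becomes an edge from its (k-1)-length prefix to its (k-1)-length suffix, and a contig is grown by following nodes that have exactly one successor.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Return a dict mapping each (k-1)-mer to the set of (k-1)-mers that follow it.

    Hypothetical helper for illustration only: every k-mer in every read
    contributes one edge, prefix -> suffix. Duplicate edges are collapsed.
    """
    edges = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[kmer[:-1]].add(kmer[1:])
    return edges


def walk_unambiguous(edges, start):
    """Extend a contig from `start` while the current node has exactly one successor.

    Simplification (an assumption of this sketch): only out-degree is checked,
    so this is not a full unitig construction, which would also check in-degree.
    """
    contig = start
    node = start
    visited = {start}
    while len(edges.get(node, ())) == 1:
        (nxt,) = edges[node]
        if nxt in visited:  # stop on a cycle instead of looping forever
            break
        contig += nxt[-1]  # each step adds one new base
        visited.add(nxt)
        node = nxt
    return contig
```

As a usage example under these assumptions, the overlapping toy reads `["ATGGCGT", "GGCGTGC", "CGTGCAA"]` with `k = 4` produce a simple chain of 3-mers, and `walk_unambiguous(build_de_bruijn(reads, 4), "ATG")` reconstructs the single contig `ATGGCGTGCAA`. A distributed assembler would partition the k-mers and graph nodes across machines; that part is not shown here.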
Pages: 1013-1018
Page count: 6
Related papers
50 in total
  • [1] Yet another de novo genome assembler
    Vaser, Robert
    Sikic, Mile
    [J]. PROCEEDINGS OF THE 2019 11TH INTERNATIONAL SYMPOSIUM ON IMAGE AND SIGNAL PROCESSING AND ANALYSIS (ISPA 2019), 2019, : 147 - 151
  • [2] SWAP-Assembler 2: Optimization of De Novo Genome Assembler at Extreme Scale
    Meng, Jintao
    Seo, Sangmin
    Balaji, Pavan
    Wei, Yanjie
    Wang, Bingqiang
    Feng, Shenzhong
    [J]. PROCEEDINGS 45TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING - ICPP 2016, 2016, : 195 - 204
  • [3] A de novo Genome Assembler based on MapReduce and Bi-directed de Bruijn Graph
    Zhang, Yuehua
    Xuan, Pengfei
    Wang, Yunsheng
    Srimani, Pradip K.
    Luo, Feng
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 65 - 71
  • [4] HipMer: An Extreme-Scale De Novo Genome Assembler
    Georganas, Evangelos
    Buluc, Aydin
    Chapman, Jarrod
    Hofmeyr, Steven
    Aluru, Chaitanya
    Egan, Rob
    Oliker, Leonid
    Rokhsar, Daniel
    Yelick, Katherine
    [J]. PROCEEDINGS OF SC15: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2015,
  • [5] PadeNA: A PARALLEL DE NOVO ASSEMBLER
    Thareja, Gaurav
    Kumar, Vivek
    Zyskowski, Mike
    Mercer, Simon
    Davidson, Bob
    [J]. BIOINFORMATICS 2011, 2011, : 196 - +
  • [7] De novo short read assembler
    Unknown
    [J]. NATURE METHODS, 2012, 9 (02) : 125 - 125
  • [8] Assembler for de novo assembly of large genomes
    Chu, Te-Chin
    Lu, Chen-Hua
    Liu, Tsunglin
    Lee, Greg C.
    Li, Wen-Hsiung
    Shih, Arthur Chun-Chieh
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2013, 110 (36) : E3417 - E3424
  • [9] Tedna: a transposable element de novo assembler
    Zytnicki, Matthias
    Akhunov, Eduard
    Quesneville, Hadi
    [J]. BIOINFORMATICS, 2014, 30 (18) : 2656 - 2658
  • [10] ArrOW: Experiencing a Parallel Cloud-based De Novo Assembler Workflow
    Ocana, Kary
    Guedes, Thaylon
    de Oliveira, Daniel
    [J]. 2019 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2019, : 185 - 190