Efficient de Bruijn Graph Construction For Genome Assembly Using a Hash Table and Auxiliary Vector Data Structures

Cited by: 0
Authors
Limon, Mahfuzer Rahman [1]
Sharker, Ratul [1]
Biswas, Sajib [1]
Rahman, M. Sohel [1]
Affiliations
[1] Bangladesh Univ Engn & Technol, Dept Comp Sci & Engn, Dhaka 1000, Bangladesh
Keywords
Computer Science; genome assembly; de Bruijn graph; Vector; Hashtable
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Modern next-generation sequencing technologies can generate huge volumes of data. One popular and useful tool for analyzing this data is the so-called de Bruijn graph. Because of the huge number of nodes, the main barriers in de Bruijn graph based genome assembly are memory and runtime, and this area has been the focus of significant attention in the contemporary literature. We present an algorithm that strikes a balance between memory and runtime. Our approach stores the de Bruijn graph in a hash table with an auxiliary data structure, which improves the total memory usage and runtime with no false positives. In the whole assembly process, the graph construction procedure generally takes the major share of the time, and our approach presents a significant advancement in this respect. All the data files (in FASTA format) along with the program code are available for download at the following link: https://drive.google.com/folderview?id=0B3D-hZtRZ933V1dMOVBHUkNJM00&usp=sharing Contact: M. Sohel Rahman (msrahman@cse.buet.ac.bd).
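As a rough illustration of the general idea of keeping a de Bruijn graph in a hash table, the following minimal Python sketch maps each k-mer to a 4-bit mask recording which bases follow it in some read. It is not the authors' data structure: the abstract does not describe the layout of the hash table or of the auxiliary vector, so the names `build_de_bruijn` and `successors`, the bitmask encoding, and the assumption that reads contain only A, C, G and T are all illustrative choices.

```python
from collections import defaultdict

BASES = "ACGT"  # assumption: reads contain only these four characters


def build_de_bruijn(reads, k):
    """Map each k-mer to a 4-bit mask of the bases that follow it in a read."""
    graph = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer] |= 0  # register the node even if it has no successor
            if i + k < len(read):
                # Set the bit of the base that extends this k-mer.
                graph[kmer] |= 1 << BASES.index(read[i + k])
    return dict(graph)


def successors(graph, kmer):
    """Return the k-mers reachable from `kmer` by one outgoing edge."""
    mask = graph.get(kmer, 0)
    return [kmer[1:] + b for j, b in enumerate(BASES) if mask & (1 << j)]


if __name__ == "__main__":
    g = build_de_bruijn(["ACGTAC", "CGTT"], k=3)
    print(successors(g, "CGT"))
```

Because every k-mer is stored exactly as a dictionary key, a lookup never reports a k-mer that was not in the input, which is the sense in which an exact hash table gives no false positives, in contrast to probabilistic structures such as Bloom filters.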
Pages: 121-126
Page count: 6