HaVec: An Efficient de Bruijn Graph Construction Algorithm for Genome Assembly

Cited by: 4
Authors
Limon, Mahfuzer Rahman [1]
Sharker, Ratul [1]
Biswas, Sajib [1]
Rahman, M. Sohel [1]
Affiliation
[1] BUET, Dept CSE, ECE Bldg West Palasi, Dhaka 1205, Bangladesh
DOI
10.1155/2017/6120980
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Background. The rapid advancement of sequencing technologies has made it possible to regularly produce millions of high-quality reads from the DNA samples in the sequencing laboratories. To this end, the de Bruijn graph is a popular data structure in the genome assembly literature for efficient representation and processing of data. Due to the number of nodes in a de Bruijn graph, the main barrier here is the memory and runtime. Therefore, this area has received significant attention in contemporary literature.
Results. In this paper, we present an approach called HaVec that attempts to achieve a balance between the memory consumption and the running time. HaVec uses a hash table along with an auxiliary vector data structure to store the de Bruijn graph thereby improving the total memory usage and the running time. A critical and noteworthy feature of HaVec is that it exhibits no false positive error.
Conclusions. In general, the graph construction procedure takes the major share of the time involved in an assembly process. HaVec can be seen as a significant advancement in this aspect. We anticipate that HaVec will be extremely useful in the de Bruijn graph-based genome assembly.
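The abstract describes the storage scheme only at a high level, so the following Python sketch is an illustration of the general idea and not the authors' implementation. It assumes that each hash-table slot holds a small vector of entries, that every entry keeps the full k-mer together with a 4-bit mask of outgoing edges, and that reads contain only the bases A, C, G and T. The class name `HashVectorGraph`, the `table_size` parameter and the entry layout are all hypothetical. Because a lookup compares the stored k-mer exactly, a query can never report a k-mer that was not inserted, which is one way to obtain the "no false positive" property the abstract mentions.

```python
BASES = "ACGT"  # assumption: reads contain only these four bases


def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


class HashVectorGraph:
    """Illustrative de Bruijn graph store: hash table whose slots are vectors."""

    def __init__(self, table_size=1024):
        self.table_size = table_size
        # One vector (Python list) per slot; colliding k-mers share a vector.
        self.table = [[] for _ in range(table_size)]

    def _entry(self, kmer, create=False):
        slot = self.table[hash(kmer) % self.table_size]
        for entry in slot:
            if entry[0] == kmer:  # exact comparison, so no false positives
                return entry
        if create:
            entry = [kmer, 0]  # [k-mer, bitmask of outgoing edges]
            slot.append(entry)
            return entry
        return None

    def add_read(self, read, k):
        """Insert all k-mers of a read and link consecutive k-mers."""
        prev = None
        for kmer in kmers(read, k):
            self._entry(kmer, create=True)
            if prev is not None:
                # Record the edge prev -> kmer by the base that extends prev.
                self._entry(prev)[1] |= 1 << BASES.index(kmer[-1])
            prev = kmer

    def __contains__(self, kmer):
        return self._entry(kmer) is not None

    def successors(self, kmer):
        """Return the k-mers reachable from kmer by one outgoing edge."""
        entry = self._entry(kmer)
        if entry is None:
            return []
        return [kmer[1:] + b for i, b in enumerate(BASES) if (entry[1] >> i) & 1]


graph = HashVectorGraph(table_size=8)
graph.add_read("ACGTAC", 3)
graph.add_read("ACGG", 3)
print(graph.successors("ACG"))  # ['CGG', 'CGT']
```

Storing the full k-mer in each entry trades memory for exactness; a production implementation would more likely pack bases into 2 bits each and size the table to the expected number of distinct k-mers.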
Pages: 12