Succinct dynamic de Bruijn graphs

Cited by: 7
Authors
Alipanahi, Bahar [1]
Kuhnle, Alan [2]
Puglisi, Simon J. [3]
Salmela, Leena [3]
Boucher, Christina [1]
Affiliations
[1] Univ Florida, Coll Engn, Dept Comp & Informat Sci & Engn, Gainesville, FL 32611 USA
[2] Florida State Univ, Dept Comp Sci, Tallahassee, FL 32306 USA
[3] Univ Helsinki, Helsinki Inst Informat Technol, Dept Comp Sci, Helsinki 00014, Finland
Funding
Academy of Finland
Keywords
DOI
10.1093/bioinformatics/btaa546
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Motivation: The de Bruijn graph is one of the fundamental data structures for the analysis of high-throughput sequencing data. To be applicable to population-scale studies, the graph must be built and stored in a space- and time-efficient manner. In addition, because population studies change continually, it has become essential to update the graph after construction, e.g. to add and remove nodes and edges. Although substantial effort has gone into making the construction and storage of the graph efficient, little work addresses building the graph in a manner that is both efficient and mutable. Hence, most space-efficient data structures require complete reconstruction of the graph in order to add or remove edges or nodes.
Results: In this article, we present DynamicBOSS, a succinct representation of the de Bruijn graph that allows for an unlimited number of additions and deletions of nodes and edges. We compare our method with competing methods and demonstrate that DynamicBOSS is the only method that supports both addition and deletion and is applicable to very large samples (e.g. greater than 15 billion k-mers). Competing dynamic methods either cannot be constructed on large-scale datasets (e.g. FDBG) or cannot support both addition and deletion (e.g. BiFrost).
Pages: 1946-1952
Number of pages: 7
Related papers
50 in total
  • [1] Succinct colored de Bruijn graphs
    Muggli, Martin D.
    Bowe, Alexander
    Noyes, Noelle R.
    Morley, Paul S.
    Belk, Keith E.
    Raymond, Robert
    Gagie, Travis
    Puglisi, Simon J.
    Boucher, Christina
    [J]. BIOINFORMATICS, 2017, 33 (20) : 3181 - 3187
  • [2] Practical dynamic de Bruijn graphs
    Crawford, Victoria G.
    Kuhnle, Alan
    Boucher, Christina
    Chikhi, Rayan
    Gagie, Travis
    [J]. BIOINFORMATICS, 2018, 34 (24) : 4189 - 4195
  • [3] Fully Dynamic de Bruijn Graphs
    Belazzougui, Djamal
    Gagie, Travis
    Makinen, Veli
    Previtali, Marco
    [J]. STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2016, 2016, 9954 : 145 - 152
  • [4] De Bruijn sequences and De Bruijn graphs for a general language
    Moreno, E
    [J]. INFORMATION PROCESSING LETTERS, 2005, 96 (06) : 214 - 219
  • [5] Buffering updates enables efficient dynamic de Bruijn graphs
    Alanko, Jarno
    Alipanahi, Bahar
    Settle, Jonathen
    Boucher, Christina
    Gagie, Travis
    [J]. COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2021, 19 : 4067 - 4078
  • [6] Generalized de Bruijn graphs
    Malyshev, F. M.
    Tarakanov, V. E.
    [J]. Mathematical Notes, 1997, 62 : 449 - 456
  • [7] On the Representation of de Bruijn Graphs
    Chikhi, Rayan
    Limasset, Antoine
    Jackman, Shaun
    Simpson, Jared T.
    Medvedev, Paul
    [J]. RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, RECOMB2014, 2014, 8394 : 35 - 55
  • [8] Enhanced de Bruijn graphs
    Guzide, O
    Wagh, MD
    [J]. AMCS '05: PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON ALGORITHMIC MATHEMATICS AND COMPUTER SCIENCE, 2005, : 23 - 28
  • [9] Generalized de Bruijn graphs
    Malyshev, FM
    Tarakanov, VE
    [J]. MATHEMATICAL NOTES, 1997, 62 (3-4) : 449 - 456
  • [10] On hypercubes in de Bruijn graphs
    Andreae, Thomas
    Hintz, Martin
    [J]. Parallel Processing Letters, 1998, 8 (02) : 259 - 268