Overlap graphs and de Bruijn graphs: data structures for de novo genome assembly in the big data era

Cited by: 0
Authors
Raffaella Rizzi
Stefano Beretta
Murray Patterson
Yuri Pirola
Marco Previtali
Gianluca Della Vedova
Paola Bonizzoni
Affiliation
[1] Department of Informatics, Systems and Communications, University of Milan-Bicocca
DOI: not available
Abstract
Background: De novo genome assembly relies on two kinds of graphs: de Bruijn graphs and overlap graphs. Overlap graphs are the basis for the Celera assembler, while de Bruijn graphs have become the dominant technical device in the last decade. Those two kinds of graphs are collectively called assembly graphs. Results: In this review, we discuss the most recent advances in the problem of constructing, representing and navigating assembly graphs, focusing on very large datasets. We will also explore some computational techniques, such as the Bloom filter, to compactly store graphs while keeping all functionalities intact. Conclusions: We complete our analysis with a discussion on the algorithmic issues of assembling from long reads (e.g., PacBio and Oxford Nanopore). Finally, we present some of the most relevant open problems in this field.
Pages: 15
Related papers
50 in total
  • [1] Overlap graphs and de Bruijn graphs: data structures for de novo genome assembly in the big data era
    Rizzi, Raffaella
    Beretta, Stefano
    Patterson, Murray
    Pirola, Yuri
    Previtali, Marco
    Della Vedova, Gianluca
    Bonizzoni, Paola
    [J]. QUANTITATIVE BIOLOGY, 2019, 7 (04) : 278 - 292
  • [2] Combining De Bruijn Graphs, Overlap Graphs and Microassembly for De Novo Genome Assembly
    Sergushichev, A. A.
    Alexandrov, A. V.
    Kazakov, S. V.
    Tsarev, F. N.
    Shalyto, A. A.
    [J]. IZVESTIYA SARATOVSKOGO UNIVERSITETA NOVAYA SERIYA-MATEMATIKA MEKHANIKA INFORMATIKA, 2013, 13 (02) : 10 - 10
  • [3] From Indexing Data Structures to de Bruijn Graphs
    Cazaux, Bastien
    Lecroq, Thierry
    Rivals, Eric
    [J]. COMBINATORIAL PATTERN MATCHING, CPM 2014, 2014, 8486 : 89 - 99
  • [4] How to apply de Bruijn graphs to genome assembly
    Phillip E C Compeau
    Pavel A Pevzner
    Glenn Tesler
    [J]. Nature Biotechnology, 2011, 29 : 987 - 991
  • [5] Integration of string and de Bruijn graphs for genome assembly
    Huang, Yao-Ting
    Liao, Chen-Fu
    [J]. BIOINFORMATICS, 2016, 32 (09) : 1301 - 1307
  • [6] How to apply de Bruijn graphs to genome assembly
    Compeau, Phillip E. C.
    Pevzner, Pavel A.
    Tesler, Glenn
    [J]. NATURE BIOTECHNOLOGY, 2011, 29 (11) : 987 - 991
  • [7] De novo assembly and genotyping of variants using colored de Bruijn graphs
    Zamin Iqbal
    Mario Caccamo
    Isaac Turner
    Paul Flicek
    Gil McVean
    [J]. Nature Genetics, 2012, 44 : 226 - 232
  • [8] De novo assembly and genotyping of variants using colored de Bruijn graphs
    Iqbal, Zamin
    Caccamo, Mario
    Turner, Isaac
    Flicek, Paul
    McVean, Gil
    [J]. NATURE GENETICS, 2012, 44 (02) : 226 - 232
  • [9] Linking indexing data structures to de Bruijn graphs: Construction and update
    Cazaux, Bastien
    Lecroq, Thierry
    Rivals, Eric
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2019, 104 : 165 - 183
  • [10] Velvet: Algorithms for de novo short read assembly using de Bruijn graphs
    Zerbino, Daniel R.
    Birney, Ewan
    [J]. GENOME RESEARCH, 2008, 18 (05) : 821 - 829