MiniDBG: A Novel and Minimal De Bruijn Graph for Read Mapping

Authors
Yu, Changyong [1]
Zhao, Yuhai [1]
Zhao, Chu [1]
Jin, Jianyu [1]
Mao, Keming [1]
Wang, Guoren [2]
Affiliations
[1] Northeastern Univ, Coll Comp Sci & Engn, Shenyang 110819, Peoples R China
[2] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bioinformatics; Genomics; Indexing; Costs; Task analysis; Memory management; Data structures; Graph algorithms; Indexing methods; Bioinformatics databases; Queries; ALIGNMENT; GENOMES; ALGORITHM; BUILD; SEED;
DOI
10.1109/TCBB.2023.3340251
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The De Bruijn graph (DBG) is widely used in bioinformatics algorithms for indexing or organizing read and reference sequences. However, no DBG model proposed so far can locate each node, edge, and path on the underlying sequence. Recently, the DBG has been used to represent reference sequences in read mapping tasks. In this setting, the paths of the DBG and the substrings of the reference sequence are not in one-to-one correspondence. This gives rise to false paths on the DBG, that is, paths that no substring of the reference produces. Moreover, if a candidate path of a read is true, it must still be located and verified on the sequence. To solve these problems, we propose a DBG model, called MiniDBG, which stores the position lists of a minimal set of edges. With these position lists, MiniDBG can locate any node, edge, or path efficiently. We also propose algorithms for generating MiniDBG from an original DBG and algorithms for locating edges or paths on the sequence. We ran experiments on real datasets to compare these algorithms with BWT-based and position-list-based methods. The experimental results show that MiniDBG locates edges and paths efficiently with lower memory costs.
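The false-path problem described in the abstract can be illustrated with a small sketch. It is not the MiniDBG algorithm: it stores a position list for every edge rather than for a minimal set of edges. The function names `build_edge_positions` and `locate_path`, and the toy reference string, are hypothetical and chosen only for illustration. Nodes are assumed to be k-mers and edges (k+1)-mers.

```python
from collections import defaultdict


def build_edge_positions(ref, k):
    """Map each (k+1)-mer of ref (an edge between two k-mer nodes)
    to the sorted list of its start positions in ref.
    Assumption: every edge keeps a position list, unlike MiniDBG."""
    positions = defaultdict(list)
    for i in range(len(ref) - k):
        positions[ref[i:i + k + 1]].append(i)
    return dict(positions)


def locate_path(query, edge_positions, k):
    """Return the start positions in the reference at which the path
    spelled by query occurs. An empty result for a path whose edges
    all exist in the graph indicates a false path."""
    edges = [query[i:i + k + 1] for i in range(len(query) - k)]
    if not edges:
        return []
    candidates = set(edge_positions.get(edges[0], []))
    for offset, edge in enumerate(edges[1:], start=1):
        occurrences = set(edge_positions.get(edge, []))
        # Keep a start position only if this edge occurs exactly
        # `offset` characters after it in the reference.
        candidates = {p for p in candidates if p + offset in occurrences}
        if not candidates:
            break
    return sorted(candidates)


ref = "ACGTTCGA"  # toy reference, hypothetical
k = 2
index = build_edge_positions(ref, k)

true_hits = locate_path("TCGA", index, k)   # a substring of ref
false_hits = locate_path("ACGA", index, k)  # edges ACG and CGA both exist
print(true_hits, false_hits)
```

Here the query `ACGA` walks two edges that are both present in the graph, yet the reference contains no such substring, so the position lists rule the path out. How MiniDBG achieves the same effect while storing fewer lists is described in the paper itself, not in this sketch.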
Pages: 129-142
Number of pages: 14