Approximate all-pairs suffix/prefix overlaps

Cited by: 10
Authors
Valimaki, Niko [1]
Ladra, Susana [2]
Makinen, Veli [1]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, HIIT, FIN-00014 Helsinki, Finland
[2] Univ A Coruna, Dept Comp Sci, La Coruna, Spain
Funding
Academy of Finland; European Research Council
Keywords
Suffix/prefix matching; Approximate pattern matching; ALGORITHMS; ALIGNMENT; GENOME; ULTRAFAST; TOOL;
DOI
10.1016/j.ic.2012.02.002
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Finding approximate overlaps is the first phase of many sequence assembly methods. Given a set of strings of total length $n$ and an error rate $\epsilon$, the goal is to find, for all pairs of strings, their suffix/prefix matches (overlaps) that are within edit distance $k = \lceil \epsilon \ell \rceil$, where $\ell$ is the length of the overlap. We propose a new solution for this problem based on backward backtracking (Lam et al., 2008) and suffix filters (Karkkainen and Na, 2008). Our technique uses $nH_k + o(n \log \sigma) + r \log r$ bits of space, where $H_k$ is the $k$-th order entropy and $\sigma$ the alphabet size. In practice, it scales better in space than q-gram filters (Rasmussen et al., 2006) and is comparable to them in time. Our method is also easy to parallelize and scales up to millions of DNA reads. © 2012 Elsevier Inc. All rights reserved.
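To make the problem statement concrete, here is a minimal brute-force sketch for a single pair of strings. It is not the backward-backtracking or suffix-filter method the abstract proposes; it is a plain dynamic program that only illustrates what an approximate suffix/prefix overlap is. The function name `approx_overlaps`, the `min_len` parameter, and the choice to measure the overlap length on the prefix of the second string are assumptions made for this illustration.

```python
from math import ceil

def approx_overlaps(a, b, eps, min_len=1):
    """Hypothetical helper: list (ell, dist) pairs such that the prefix
    b[:ell] is within ceil(eps * ell) edits of some suffix of a.
    Assumption: the overlap length ell is measured on the prefix of b."""
    m = len(b)
    # prev[j] = minimum edit distance between b[:j] and any substring of a
    # that ends at the current position in a (initially the empty string).
    prev = list(range(m + 1))
    for ch in a:
        cur = [0] * (m + 1)  # cur[0] = 0: the match may start anywhere in a
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j - 1] + (ch != b[j - 1]),  # match or substitution
                prev[j] + 1,                     # extra character in a
                cur[j - 1] + 1,                  # extra character in b
            )
        prev = cur
    # After the last character of a, prev[j] ranges over suffixes of a only.
    return [(ell, prev[ell]) for ell in range(min_len, m + 1)
            if prev[ell] <= ceil(eps * ell)]
```

This sketch takes time proportional to the product of the two string lengths per pair, so running it over all pairs of reads would be quadratic in the input; that cost is the kind of thing index-based approaches such as the one described in the abstract aim to avoid.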
Pages: 49-58
Number of pages: 10