An Efficient Algorithm for Finding All Pairs k-Mismatch Maximal Common Substrings

Cited by: 2
Authors
Thankachan, Sharma V. [1]
Chockalingam, Sriram P. [2]
Aluru, Srinivas [1]
Affiliations
[1] Georgia Inst Technol, Sch CSE, Atlanta, GA 30332 USA
[2] Indian Inst Technol, Dept CSE, Bombay, Maharashtra, India
Keywords
LINEAR-TIME CONSTRUCTION; SUFFIX-ARRAYS;
DOI
10.1007/978-3-319-38782-6_1
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Identifying long pairwise maximal common substrings among a large set of sequences is a frequently used construct in computational biology, with applications in DNA sequence clustering and assembly. Due to errors made by sequencers, algorithms that can accommodate a small number of differences are of particular interest, but obtaining provably efficient solutions for such problems has been elusive. In this paper, we present a provably efficient algorithm with an expected run time guarantee of O(N log^k N + occ), where occ is the output size, for the following problem: Given a collection D = {S_1, S_2, ..., S_n} of n sequences of total length N, a length threshold φ and a mismatch threshold k ≥ 0, report all k-mismatch maximal common substrings of length at least φ over all pairs of sequences in D. In addition, we present a result showing the hardness of this problem.
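
To make the problem statement concrete, here is a naive brute-force sketch in Python. It is not the algorithm from the paper and runs in quadratic time per pair of sequences, far from the stated expected bound. It assumes that a k-mismatch maximal common substring is a pair of equal-length substrings with Hamming distance at most k that cannot be extended to the left or right without exceeding k mismatches. The function names `diagonal_windows` and `all_pairs_kmismatch_mcs` and the parameter `min_len` (the length threshold, assumed to be at least 1) are illustrative only.

```python
from itertools import combinations


def diagonal_windows(x, y, k, min_len):
    """Yield (start_x, start_y, length) for every maximal aligned window
    of x and y that contains at most k mismatches and has length >= min_len.
    Brute force: every alignment (diagonal) of x against y is scanned."""
    for shift in range(-(len(y) - 1), len(x)):
        ax, ay = max(shift, 0), max(-shift, 0)   # where the diagonal starts
        m = min(len(x) - ax, len(y) - ay)        # length of the diagonal
        # Mismatch positions on this diagonal, with sentinels at both ends.
        q = [-1] + [t for t in range(m) if x[ax + t] != y[ay + t]] + [m]
        if len(q) - 2 <= k:
            # At most k mismatches in total: the whole diagonal is maximal.
            spans = [(0, m)]
        else:
            # Each maximal window holds exactly k mismatches and is bounded
            # on each side by a mismatch or by the end of the diagonal.
            spans = [(q[r] + 1, q[r + k + 1]) for r in range(len(q) - k - 1)]
        for s, e in spans:
            if e - s >= min_len:
                yield ax + s, ay + s, e - s


def all_pairs_kmismatch_mcs(seqs, k, min_len):
    """Report (i, j, start_i, start_j, length) over all pairs i < j."""
    out = []
    for (i, x), (j, y) in combinations(enumerate(seqs), 2):
        for sx, sy, length in diagonal_windows(x, y, k, min_len):
            out.append((i, j, sx, sy, length))
    return out


if __name__ == "__main__":
    D = ["ACGTACGT", "ACGAACGT"]
    print(all_pairs_kmismatch_mcs(D, k=1, min_len=5))
```

With one allowed mismatch the two example sequences align end to end, so a single window of length 8 is reported; with k = 0 the same alignment splits at the mismatching position.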
Pages: 3-14
Page count: 12