Efficient algorithms for (δ, γ, α) and (δ, kΔ, α)-matching

Cited by: 4
Authors
Fredriksson, Kimmo [1]
Grabowski, Szymon [2]
Affiliations
[1] Univ Joensuu, Dept Comp Sci & Stat, FIN-80101 Joensuu, Finland
[2] Tech Univ Lodz, Dept Comp Engn, PL-90924 Lodz, Poland
Keywords
approximate string matching; music information retrieval; computational biology; bit-parallelism; sparse dynamic programming; bounded gaps;
DOI
10.1142/S0129054108005607
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
We propose new algorithms for $(\delta, \gamma, \alpha)$-matching. In this string matching problem we are given a pattern $P = p_0 p_1 \ldots p_{m-1}$ and a text $T = t_0 t_1 \ldots t_{n-1}$ over some integer alphabet $\Sigma = \{0, \ldots, \sigma - 1\}$. The pattern symbol $p_i$ $\delta$-matches the text symbol $t_j$ iff $|p_i - t_j| \le \delta$. The pattern $P$ $(\delta, \gamma)$-matches some text substring $t_j \ldots t_{j+m-1}$ iff for all $i$ it holds that $|p_i - t_{j+i}| \le \delta$ and $\sum_i |p_i - t_{j+i}| \le \gamma$. Finally, in $(\delta, \gamma, \alpha)$-matching we also permit gaps of at most $\alpha$ text symbols between consecutive matching text symbols. The only known previous algorithm runs in $O(nm)$ time. We give several algorithms that improve the average case up to $O(n)$ for small $\alpha$, and the worst case to $O(\min\{nm, |M|\alpha\})$ or $O(nm \log(\gamma)/w)$, where $M = \{(i, j) \mid |p_i - t_j| \le \delta\}$ and $w$ is the number of bits in a machine word. The proposed algorithms can be easily modified to solve several other related problems; we explicitly consider, e.g., character classes (instead of $\delta$-matching), ($\Delta$-limited) $k$-mismatches (instead of $\gamma$-matching) and more general gaps, including negative ones. These find important applications in computational biology. We conclude with experimental results showing that the algorithms are very efficient in practice.
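To make the problem definition concrete, the following is a minimal sketch of a naive dynamic-programming baseline for (delta, gamma, alpha)-matching. It is not one of the algorithms proposed in the paper, and it runs in O(nm·alpha) time, which is slower than the bounds quoted in the abstract. The function name `dga_match_ends` and the convention of reporting match end positions are assumptions made for illustration only.

```python
import math


def dga_match_ends(pattern, text, delta, gamma, alpha):
    """Return the text positions where a (delta, gamma, alpha)-match ends.

    Illustrative baseline only (hypothetical helper, not the paper's method).
    prev[j] holds the smallest summed error over all ways of aligning the
    pattern prefix processed so far with its last symbol placed on text[j],
    or infinity if no such alignment respects delta and alpha.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return []
    inf = math.inf

    # Row for p_0: it may start anywhere it delta-matches.
    prev = [
        abs(pattern[0] - t) if abs(pattern[0] - t) <= delta else inf
        for t in text
    ]

    for i in range(1, m):
        cur = [inf] * n
        for j in range(i, n):
            d = abs(pattern[i] - text[j])
            if d > delta:
                continue  # p_i does not delta-match t_j
            # The previous pattern symbol must sit at most alpha + 1
            # positions to the left, i.e. a gap of at most alpha symbols.
            lo = max(0, j - alpha - 1)
            best = min(prev[lo:j], default=inf)
            cur[j] = best + d
        prev = cur

    # A match ends at j when the best summed error is within gamma.
    return [j for j in range(n) if prev[j] <= gamma]


print(dga_match_ends([1, 3, 5], [1, 9, 3, 5], delta=0, gamma=0, alpha=1))  # [3]
```

Tracking only the minimum summed error per cell is enough here, because a match exists at a given end position exactly when the cheapest valid alignment stays within gamma.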
Pages: 163-183
Number of pages: 21