Improving an algorithm for approximate pattern matching

Cited by: 16
Authors
Navarro, G [1]
Baeza-Yates, R [1]
Affiliations
[1] Univ Chile, Dept Comp Sci, Santiago, Chile
Keywords
string matching allowing errors; bit-parallelism; edit distance; approximate matching probability;
DOI
10.1007/s00453-001-0034-6
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
We study a recent algorithm for fast on-line approximate string matching. This is the problem of searching for a pattern in a text while allowing errors in the pattern or in the text. The algorithm is based on a very fast kernel which is able to search short patterns using a nondeterministic finite automaton, which is simulated using bit-parallelism. A number of techniques to extend this kernel to longer patterns are presented in that work. However, the techniques can be integrated in many ways, and the optimal interplay among them is by no means obvious. The solution to this problem starts at a very low level, by obtaining basic probabilistic information about the problem which was not previously known, and ends by integrating analytical results with empirical data to obtain the optimal heuristic. The conclusions obtained via analysis are experimentally confirmed. We also improve many of the techniques and obtain a combined heuristic which is faster than the original work. This work is an excellent example of a complex, theoretical analysis of algorithms used for design and for practical algorithm engineering, instead of the common practice of first designing an algorithm and then analyzing it.
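To make the idea of simulating a nondeterministic automaton with bit-parallelism concrete, the following is a minimal sketch of the classic Wu-Manber Shift-And scheme with k errors. It is an illustration only and is not taken from this record: the kernel analyzed in the paper may organize the automaton differently, and the function name `approx_search` and its parameters are assumptions made for this example.

```python
def approx_search(pattern, text, k):
    """Return the end positions in `text` where `pattern` matches with at
    most `k` edit errors (insertions, deletions, substitutions).

    Illustrative Wu-Manber-style Shift-And sketch; the name and signature
    are hypothetical, not from the paper.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # masks[c] has bit i set when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    full = (1 << m) - 1      # keeps states within m bits
    accept = 1 << (m - 1)    # bit of the final automaton state

    # R[d] bit i is set when pattern[0..i] matches some suffix of the text
    # read so far with at most d errors. Before reading any text, the
    # first d pattern characters can be accounted for by d deletions.
    R = [((1 << d) - 1) & full for d in range(k + 1)]

    ends = []
    for j, c in enumerate(text):
        mask = masks.get(c, 0)
        prev_old = R[0]                       # row d-1 before this character
        R[0] = ((R[0] << 1) | 1) & mask       # exact-match row
        prev_new = R[0]                       # row d-1 after this character
        for d in range(1, k + 1):
            old = R[d]
            R[d] = ((((old << 1) | 1) & mask)  # matching character
                    | prev_old                 # extra character in the text
                    | (prev_old << 1)          # substituted character
                    | (prev_new << 1)          # pattern character skipped
                    | 1) & full
            prev_old, prev_new = old, R[d]
        if R[k] & accept:
            ends.append(j)
    return ends
```

Each row of the automaton is held in one integer, so a single text character updates all states of a row with a handful of shifts and bitwise operations. In a language with fixed-width machine words this only covers short patterns directly, which is the limitation the abstract's "techniques to extend this kernel" address.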
Pages: 473 - 502
Page count: 30
Related papers
50 in total
  • [1] Improving an Algorithm for Approximate Pattern Matching
    G. Navarro
    R. Baeza-Yates
    Algorithmica, 2001, 30 : 473 - 502
  • [2] Approximate Pattern Matching Algorithm
    Hurtik, Petr
    Hodakova, Petra
    Perfilieva, Irina
    INFORMATION PROCESSING AND MANAGEMENT OF UNCERTAINTY IN KNOWLEDGE-BASED SYSTEMS, IPMU 2016, PT I, 2016, 610 : 577 - 587
  • [3] An Efficient Algorithm for Approximate Pattern Matching with Swaps
    Campanelli, Matteo
    Cantone, Domenico
    Faro, Simone
    Giaquinta, Emanuele
    PROCEEDINGS OF THE PRAGUE STRINGOLOGY CONFERENCE 2009, 2009 : 90 - 104
  • [4] Efficient Algorithm for δ - Approximate Jumbled Pattern Matching
    Castellanos, Ivan
    Pinzon, Yoan
    PROCEEDINGS OF THE PRAGUE STRINGOLOGY CONFERENCE 2015, 2015 : 47 - 56
  • [5] The improving pattern matching algorithm of intrusion detection
    Qu, Zhaoyang
    Huang, Xiaobo
    CEIS 2011, 2011, 15
  • [6] On approximate pattern matching with thresholds
    Zhang, Peng
    Atallah, Mikhail J.
    INFORMATION PROCESSING LETTERS, 2017, 123 : 21 - 26
  • [7] APPROXIMATE PATTERN-MATCHING
    MANBER, U
    WU, S
    BYTE, 1992, 17 (12) : 281 - +
  • [8] Efficient skip-pattern matching algorithm for approximate string sequential problem
    Shen, Zhou
    Wang, Yongcheng
    Liu, Gongshen
    Shu Ju Cai Ji Yu Chu Li/Journal of Data Acquisition and Processing, 2001, 16 (04) : 459 - 465
  • [9] A Simple, Fast, Filter-Based Algorithm for Approximate Circular Pattern Matching
    Azim, Md. Aashikur Rahman
    Iliopoulos, Costas S.
    Rahman, M. Sohel
    Samiruzzaman, M.
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2016, 15 (02) : 95 - 102
  • [10] Approximate pattern matching with gap constraints
    Wu, Youxi
    Tang, Zhiqiang
    Jiang, He
    Wu, Xindong
    JOURNAL OF INFORMATION SCIENCE, 2016, 42 (05) : 639 - 658