Improving an algorithm for approximate pattern matching

Cited by: 16
Authors
Navarro, G [1]
Baeza-Yates, R [1]
Affiliation
[1] Univ Chile, Dept Comp Sci, Santiago, Chile
Keywords
string matching allowing errors; bit-parallelism; edit distance; approximate matching probability
DOI
10.1007/s00453-001-0034-6
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We study a recent algorithm for fast on-line approximate string matching. This is the problem of searching for a pattern in a text allowing errors in the pattern or in the text. The algorithm is based on a very fast kernel which is able to search short patterns using a nondeterministic finite automaton, which is simulated using bit-parallelism. A number of techniques to extend this kernel for longer patterns are presented in that work. However, the techniques can be integrated in many ways and the optimal interplay among them is by no means obvious. The solution to this problem starts at a very low level, by obtaining basic probabilistic information about the problem which was not previously known, and ends by integrating analytical results with empirical data to obtain the optimal heuristic. The conclusions obtained via analysis are experimentally confirmed. We also improve many of the techniques and obtain a combined heuristic which is faster than the original work. This work shows an excellent example of a complex and theoretical analysis of algorithms used for design and for practical algorithm engineering, instead of the common practice of first designing an algorithm and then analyzing it.
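To make the idea of simulating a nondeterministic finite automaton with bit-parallelism concrete, here is a minimal Python sketch. It is not the kernel studied in the paper: it uses the classic row-wise formulation due to Wu and Manber, chosen only because it is short. The function name `approx_search` and its return convention (0-based end positions) are assumptions made for illustration.

```python
def approx_search(pattern: str, text: str, k: int) -> list[int]:
    """Return the 0-based end positions in `text` where some substring
    matches `pattern` with at most k edits (insert, delete, substitute).

    Illustrative row-wise bit-parallel NFA simulation (Wu-Manber style),
    not the kernel analyzed in the paper.
    """
    m = len(pattern)
    if m == 0 or k < 0:
        raise ValueError("pattern must be non-empty and k non-negative")

    # masks[c] has bit i set iff pattern[i] == c.
    masks: dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    full = (1 << m) - 1       # keep only the m state bits
    accept = 1 << (m - 1)     # bit of the final NFA state

    # rows[j]: bit i set iff pattern[0..i] matches a suffix of the text
    # read so far with at most j errors. Initially, the first j pattern
    # characters can be accounted for by j deletions.
    rows = [((1 << j) - 1) & full for j in range(k + 1)]

    hits = []
    for pos, c in enumerate(text):
        b = masks.get(c, 0)
        old_prev = rows[0]                       # row j-1 before this step
        rows[0] = ((rows[0] << 1) | 1) & b       # exact Shift-And row
        new_prev = rows[0]                       # row j-1 after this step
        for j in range(1, k + 1):
            old_cur = rows[j]
            rows[j] = (
                (((old_cur << 1) | 1) & b)       # text char matches pattern
                | old_prev                       # extra char in the text
                | ((old_prev << 1) | 1)          # substituted char
                | ((new_prev << 1) | 1)          # pattern char missing in text
            ) & full
            old_prev = old_cur
            new_prev = rows[j]
        if rows[k] & accept:
            hits.append(pos)
    return hits
```

Each row of the automaton is held in one Python integer, so every text character costs a constant number of bitwise operations per error level. In a fixed-width machine word this only works when the pattern fits in the word, which is why techniques for extending a short-pattern kernel to longer patterns matter.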
Pages: 473-502
Number of pages: 30
Related papers
50 in total
  • [31] Strict approximate pattern matching with general gaps
    Youxi Wu
    Shuai Fu
    He Jiang
    Xindong Wu
    Applied Intelligence, 2015, 42 : 566 - 580
  • [32] A black box for online approximate pattern matching
    Clifford, Raphael
    Efremenko, Klim
    Porat, Benny
    Porat, Ely
    COMBINATORIAL PATTERN MATCHING, 2008, 5029 : 143 - +
  • [33] NetDAP: (δ, γ)-approximate pattern matching with length constraints
    Wu, Youxi
    Fan, Jinquan
    Li, Yan
    Guo, Lei
    Wu, Xindong
    APPLIED INTELLIGENCE, 2020, 50 (11) : 4094 - 4116
  • [34] State Complexity of Neighbourhoods and Approximate Pattern Matching
    Ng, Timothy
    Rappaport, David
    Salomaa, Kai
    INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2018, 29 (02) : 315 - 329
  • [35] MULTIPLE FILTRATION AND APPROXIMATE PATTERN-MATCHING
    PEVZNER, PA
    WATERMAN, MS
    ALGORITHMICA, 1995, 13 (1-2) : 135 - 154
  • [36] Exact And Approximate Pattern Matching In The Streaming Model
    Porat, Benny
    Porat, Ely
    2009 50TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE: FOCS 2009, PROCEEDINGS, 2009, : 315 - 323
  • [37] A linear size index for approximate pattern matching
    Chan, Ho-Leung
    Lam, Tak-Wah
    Sung, Wing-Kin
    Tam, Siu-Lung
    Wong, Swee-Seong
    JOURNAL OF DISCRETE ALGORITHMS, 2011, 9 (04) : 358 - 364
  • [38] Faster Approximate Pattern Matching: A Unified Approach
    Charalampopoulos, Panagiotis
    Kociumaka, Tomasz
    Wellnitz, Philip
    2020 IEEE 61ST ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS 2020), 2020, : 978 - 989
  • [39] State Complexity of Neighbourhoods and Approximate Pattern Matching
    Ng, Timothy
    Rappaport, David
    Salomaa, Kai
    DEVELOPMENTS IN LANGUAGE THEORY (DLT 2015), 2015, 9168 : 389 - 400
  • [40] Reconfigurable approximate pattern matching architectures for nanotechnology
    Annampedu, Viswanath
    Wagh, Meghanad D.
    MICROELECTRONICS JOURNAL, 2007, 38 (03) : 430 - 438