PAMA: A fast string matching algorithm

Cited by: 0
Authors
Lu, SF [1]
Cao, F [1]
Lu, Y [1]
Affiliation
[1] Wayne State Univ, Detroit, MI 48202 USA
DOI
10.1142/S0129054106003875
Chinese Library Classification (CLC) number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
String matching is a fundamental operation in computer science, and its performance has a great impact on many applications, including database querying, text processing, and DNA and protein sequence analysis. In this paper, we propose a fast string matching algorithm, PAMA (PAttern MAtching). The shift rule used by PAMA not only subsumes both the bad character rule and the good suffix rule employed by the well-known Boyer-Moore algorithm, but also employs an additional key observation to enable faster shifting during the string matching process. Theoretically, we prove that, from the same alignment, the next shift of PAMA is at least as large as that of the Boyer-Moore algorithm. Experimentally, we show that PAMA significantly outperforms the original Boyer-Moore algorithm in almost all cases, and outperforms other Boyer-Moore variants such as Tuned-BM, Turbo-BM and Horspool for long patterns (length >= 128) or for small alphabets (size < 8).
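
For readers unfamiliar with the shift-based family that PAMA is compared against, the following is a minimal Python sketch of the Horspool variant, which shifts using only a bad-character-style table keyed on the last character of the current window. It is not PAMA: the abstract does not specify PAMA's shift rule, so no attempt is made to reproduce it here. The function name `horspool_search` and the dictionary-based shift table are illustrative assumptions.

```python
def horspool_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text.

    Illustrative Horspool-style search (a simplified Boyer-Moore variant),
    not the PAMA algorithm described in the abstract.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Shift table: for each character in pattern[0 .. m-2], the distance from
    # its rightmost occurrence to the last pattern position.
    shift = {}
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i

    matches = []
    pos = 0  # current alignment of the pattern against the text
    while pos <= n - m:
        # Compare right to left, as in Boyer-Moore.
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            matches.append(pos)
        # Shift by the table entry for the text character under the last
        # pattern position; characters absent from the table allow a full
        # shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return matches


if __name__ == "__main__":
    print(horspool_search("abracadabra", "abra"))
```

Larger shifts mean fewer alignments are examined. The abstract's theoretical claim is of this kind: from the same alignment, PAMA's next shift is at least as large as Boyer-Moore's.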
Pages: 357-378
Number of pages: 22