Parameterized pattern matching: Algorithms and applications

Cited by: 95
Author
Baker, BS
Affiliation
[1] AT&T Bell Laboratories, Murray Hill, NJ 07974
DOI
10.1006/jcss.1996.0003
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The problem of finding sections of code that either are identical or are related by the systematic renaming of variables or constants can be modeled in terms of parameterized strings (p-strings) and parameterized matches (p-matches). P-strings are strings over two alphabets, one of which represents parameters. Two p-strings are a parameterized match (p-match) if one p-string is obtained by renaming the parameters of the other by a one-to-one function. In this paper, we investigate parameterized pattern matching via parameterized suffix trees (p-suffix trees). We give two algorithms for constructing p-suffix trees: one (eager) that runs in linear time for fixed alphabets, and another that uses auxiliary data structures and runs in O(n log n) time for variable alphabets, where n is the input length. We show that using a p-suffix tree for a pattern p-string P, it is possible to search for all p-matches of P within a text p-string T in space linear in |P| and time linear in |T| for fixed alphabets, or in O(|T| log(min(|P|, σ))) time and O(|P|) space for variable alphabets, where σ is the sum of the alphabet sizes. The simpler p-suffix tree construction algorithm, eager, has been implemented, and experiments show it to be practical. Since it runs faster than predicted by the above worst-case bound, we reanalyze the algorithm and show that eager runs in time O(min(t|S| + m(t, S) | t > 0) log σ), where for an input p-string S, m(t, S) is the number of maximal p-matches of length at least t that occur within S, and σ is the sum of the alphabet sizes. Experiments with the author's program dup (B. Baker, in "Comput. Sci. Statist.," Vol. 24, 1992) for finding all maximal p-matches within a p-string have found m(t, S) to be less than |S| in practice unless t is small. © 1996 Academic Press, Inc.
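The definition of a p-match in the abstract can be illustrated with a small sketch. It is not the p-suffix tree construction described in the paper: it is a naive quadratic scan, shown only to make the one-to-one renaming condition concrete. It relies on the standard "previous occurrence" encoding, in which each parameter is replaced by the distance back to its last occurrence (0 for a first occurrence), so that two p-strings p-match exactly when their encodings are equal. The function names (`prev_encode`, `p_match`, `find_p_matches`) and the representation of p-strings as Python strings plus a set of parameter symbols are assumptions made for this example.

```python
def prev_encode(s, params):
    """Encode a p-string: each parameter symbol becomes the distance to its
    previous occurrence (0 if it is the first); other symbols are kept."""
    last_seen = {}
    encoded = []
    for i, ch in enumerate(s):
        if ch in params:
            encoded.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
        else:
            encoded.append(ch)  # constant symbol, must match literally
    return encoded


def p_match(a, b, params):
    """True if b is obtained from a by a one-to-one renaming of parameters."""
    return len(a) == len(b) and prev_encode(a, params) == prev_encode(b, params)


def find_p_matches(pattern, text, params):
    """Naive O(|P| * |T|) scan (illustrative only, not the p-suffix tree method):
    return the start positions of all substrings of text that p-match pattern."""
    m = len(pattern)
    target = prev_encode(pattern, params)
    return [
        i
        for i in range(len(text) - m + 1)
        if prev_encode(text[i:i + m], params) == target
    ]


if __name__ == "__main__":
    params = set("abcuvxy")  # hypothetical parameter alphabet
    print(p_match("x=y+x", "u=v+u", params))  # consistent renaming x->u, y->v
    print(p_match("x=y+x", "u=v+v", params))  # inconsistent renaming
    print(find_p_matches("x+y", "a+b+c+c", params))
```

Each window is re-encoded from scratch because the encoding of a substring depends on where the substring starts; handling that efficiently is the kind of difficulty the p-suffix tree algorithms in the paper address.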
Pages: 28-42
Page count: 15