Parameterized pattern matching: Algorithms and applications

Cited by: 95
Author
Baker, BS
Affiliation
[1] AT&T Bell Laboratories, Murray Hill, NJ 07974
DOI
10.1006/jcss.1996.0003
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The problem of finding sections of code that either are identical or are related by the systematic renaming of variables or constants can be modeled in terms of parameterized strings (p-strings) and parameterized matches (p-matches). P-strings are strings over two alphabets, one of which represents parameters. Two p-strings are a parameterized match (p-match) if one p-string is obtained by renaming the parameters of the other by a one-to-one function. In this paper, we investigate parameterized pattern matching via parameterized suffix trees (p-suffix trees). We give two algorithms for constructing p-suffix trees: one (eager) that runs in linear time for fixed alphabets, and another that uses auxiliary data structures and runs in O(n log(n)) time for variable alphabets, where n is input length. We show that using a p-suffix tree for a pattern p-string P, it is possible to search for all p-matches of P within a text p-string T in space linear in |P| and time linear in |T| for fixed alphabets, or O(|T| log(min(|P|, sigma))) time and O(|P|) space for variable alphabets, where sigma is the sum of the alphabet sizes. The simpler p-suffix tree construction algorithm eager has been implemented, and experiments show it to be practical. Since it runs faster than predicted by the above worst-case bound, we reanalyze the algorithm and show that eager runs in time O(min(t|S| + m(t, S) | t > 0) log sigma), where for an input p-string S, m(t, S) is the number of maximal p-matches of length at least t that occur within S, and sigma is the sum of the alphabet sizes. Experiments with the author's program dup (B. Baker, in "Comput. Sci. Statist.," Vol. 24, 1992) for finding all maximal p-matches within a p-string have found m(t, S) to be less than |S| in practice unless t is small. (C) 1996 Academic Press, Inc.
Pages: 28-42
Page count: 15