Simple and efficient LZW-compressed multiple pattern matching

Cited by: 3
Author
Gawrychowski, Pawel [1, 2]
Affiliations
[1] Univ Wroclaw, Inst Comp Sci, Wroclaw, Poland
[2] Max Planck Inst Informat, Saarbrucken, Germany
Keywords
Multiple pattern matching; Lempel-Ziv-Welch compression
DOI
10.1016/j.jda.2013.10.004
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We consider a natural variant of the classical multiple pattern matching problem: given a Lempel-Ziv-Welch representation of a string and a collection of (uncompressed) patterns, does any of them occur in the text? As shown by Kida et al. [15], extending the single pattern algorithm of Amir, Benson and Farach [2] gives a running time of $O(n + M^2)$ for the more general case, where $n$ is the number of codewords in the compressed representation of the text and $M$ is the sum of the lengths of all patterns. We prove that in fact it is possible to achieve $O(n \log M + M)$ or $O(n + M^{1+\varepsilon})$ complexity. While not linear, the running times of our solutions match the single pattern bounds achieved by the previously known solutions [2,17] in a more structured and unified manner, and without using any combinatorics on words. The only nontrivial components of our method are suffix arrays, constant time range minimum queries, and balanced binary search trees. © 2013 Elsevier B.V. All rights reserved.
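To make the problem statement concrete, the following is a minimal Python sketch of the naive baseline only: it decompresses the LZW codewords and then searches the plain text. It is not the algorithm of the paper, which works on the compressed representation directly. The function names (`lzw_compress`, `lzw_decompress`, `any_pattern_occurs`) and the choice of initial dictionary (the sorted set of characters in the text) are assumptions made for illustration.

```python
def lzw_compress(text):
    """Return (alphabet, codewords) for text; n in the abstract is len(codewords)."""
    alphabet = sorted(set(text))
    table = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in text:
        if current + ch in table:
            current += ch                      # keep extending the current phrase
        else:
            codes.append(table[current])       # emit the longest known phrase
            table[current + ch] = len(table)   # register the new phrase
            current = ch
    if current:
        codes.append(table[current])
    return alphabet, codes


def lzw_decompress(alphabet, codes):
    """Rebuild the text from its LZW codewords."""
    if not codes:
        return ""
    table = list(alphabet)
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code < len(table):
            entry = table[code]
        else:
            # The codeword refers to the phrase being defined right now.
            entry = prev + prev[0]
        table.append(prev + entry[0])
        out.append(entry)
        prev = entry
    return "".join(out)


def any_pattern_occurs(alphabet, codes, patterns):
    """Naive baseline: decompress fully, then test every pattern."""
    text = lzw_decompress(alphabet, codes)
    return any(p in text for p in patterns)


alphabet, codes = lzw_compress("abababab")
print(len(codes), any_pattern_occurs(alphabet, codes, ["bb", "aba"]))
```

The baseline spends time proportional to the length of the decompressed text, which can be much larger than the number of codewords; the bounds quoted in the abstract depend only on $n$ and $M$.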
Pages: 34-41
Number of pages: 8
Related papers
50 in total
  • [1] Empirical evaluation of LZW-Compressed Multiple Pattern Matching Algorithms
    Reja, Mario
    [J]. 2022 24TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, SYNASC, 2022, : 125 - 132
  • [2] Almost optimal fully LZW-compressed pattern matching
    Gasieniec, L
    Rytter, W
    [J]. DCC '99 - DATA COMPRESSION CONFERENCE, PROCEEDINGS, 1999, : 316 - 325
  • [3] MultiPLZW: A novel multiple pattern matching search in LZW-compressed data
    Aldwairi, Monther
    Hamzah, Abdulmughni Y.
    Jarrah, Moath
    [J]. COMPUTER COMMUNICATIONS, 2019, 145 : 126 - 136
  • [4] Beating O(nm) in Approximate LZW-Compressed Pattern Matching
    Gawrychowski, Pawel
    Straszak, Damian
    [J]. ALGORITHMS AND COMPUTATION, 2013, 8283 : 78 - 88
  • [5] Tying up the loose ends in fully LZW-compressed pattern matching
    Gawrychowski, Pawel
    [J]. 29TH INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF COMPUTER SCIENCE, (STACS 2012), 2012, 14 : 624 - 635
  • [6] Multiple pattern matching in LZW compressed text
    Kida, T
    Takeda, M
    Shinohara, A
    Miyazaki, M
    Arikawa, S
    [J]. DCC '98 - DATA COMPRESSION CONFERENCE, 1998, : 103 - 112
  • [7] An efficient pattern matching scheme in LZW compressed sequences
    Lee, Tsern-Huei
    Huang, Nai-Lun
    [J]. SECURITY AND COMMUNICATION NETWORKS, 2008, 1 (04) : 325 - 335
  • [8] Multiple-pattern matching for LZW compressed files
    Tao, T
    Mukherjee, A
    [J]. ITCC 2005: International Conference on Information Technology: Coding and Computing, Vol 1, 2005, : 91 - 96
  • [9] Pattern matching in LZW compressed files
    Tao, T
    Mukherjee, A
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2005, 54 (08) : 929 - 938
  • [10] LZW based compressed pattern matching
    Tao, T
    Mukherjee, A
    [J]. DCC 2004: DATA COMPRESSION CONFERENCE, PROCEEDINGS, 2004, : 568 - 568