Simple and efficient LZW-compressed multiple pattern matching

Cited by: 3
Author
Gawrychowski, Pawel [1, 2]
Affiliations
[1] Univ Wroclaw, Inst Comp Sci, Wroclaw, Poland
[2] Max Planck Inst Informat, Saarbrucken, Germany
Keywords
Multiple pattern matching; Lempel-Ziv-Welch compression
DOI
10.1016/j.jda.2013.10.004
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We consider a natural variant of the classical multiple pattern matching problem: given a Lempel-Ziv-Welch representation of a string and a collection of (uncompressed) patterns, does any of them occur in the text? As shown by Kida et al. [15], extending the single pattern algorithm of Amir, Benson and Farach [2] gives a running time of O(n + M^2) for the more general case, where n is the number of codewords in the compressed representation of the text and M is the total length of all patterns. We prove that in fact it is possible to achieve O(n log M + M) or O(n + M^{1+ε}) complexity. While not linear, the running times of our solutions match the single pattern bounds achieved by the previously known solutions [2,17] in a more structured and unified manner, and without using any combinatorics on words. The only nontrivial components of our method are suffix arrays, constant time range minimum queries, and balanced binary search trees. © 2013 Elsevier B.V. All rights reserved.
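To make the problem statement concrete, the following is a minimal sketch of the naive baseline only: it fully decompresses the LZW codewords and then scans the text with an Aho-Corasick automaton. This is not the algorithm of the paper; its running time depends on the uncompressed text length rather than on n. The function names (`lzw_decode`, `build_automaton`, `any_pattern_occurs`), the integer codeword format, and the assumption that the initial dictionary holds the single letters of `alphabet` are all illustrative choices, and patterns are assumed to be non-empty.

```python
from collections import deque


def lzw_decode(codes, alphabet):
    """Expand LZW codewords (assumed: dictionary starts with single letters)."""
    table = list(alphabet)
    out = []
    prev = None
    for code in codes:
        if code < len(table):
            entry = table[code]
        elif code == len(table) and prev is not None:
            # Codeword refers to the entry that is being created right now.
            entry = prev + prev[0]
        else:
            raise ValueError("invalid LZW codeword")
        if prev is not None:
            table.append(prev + entry[0])
        out.append(entry)
        prev = entry
    return "".join(out)


def build_automaton(patterns):
    """Build an Aho-Corasick automaton (trie, failure links, accept flags)."""
    goto, fail, accept = [{}], [0], [False]
    for p in patterns:  # assumed non-empty
        node = 0
        for ch in p:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                accept.append(False)
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        accept[node] = True
    queue = deque(goto[0].values())  # depth-1 nodes keep failure link 0
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # A state accepts if a pattern ends here or at its failure target.
            accept[child] = accept[child] or accept[fail[child]]
            queue.append(child)
    return goto, fail, accept


def any_pattern_occurs(codes, alphabet, patterns):
    """Return True if any pattern occurs in the text encoded by `codes`."""
    text = lzw_decode(codes, alphabet)
    goto, fail, accept = build_automaton(patterns)
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if accept[state]:
            return True
    return False
```

For example, under these assumptions the codewords `[0, 1, 2, 4]` over the alphabet `"ab"` encode `abababa`, so `any_pattern_occurs([0, 1, 2, 4], "ab", ["bab", "aa"])` reports an occurrence. The bounds quoted in the abstract concern avoiding exactly this full decompression step.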
Pages: 34-41
Number of pages: 8