Simple and efficient LZW-compressed multiple pattern matching

Cited by: 3
Author
Gawrychowski, Pawel [1, 2]
Affiliations
[1] Univ Wroclaw, Inst Comp Sci, Wroclaw, Poland
[2] Max Planck Inst Informat, Saarbrucken, Germany
Keywords
Multiple pattern matching; Lempel-Ziv-Welch compression
DOI
10.1016/j.jda.2013.10.004
Chinese Library Classification
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We consider a natural variant of the classical multiple pattern matching problem: given a Lempel-Ziv-Welch representation of a string and a collection of (uncompressed) patterns, does any of them occur in the text? As shown by Kida et al. [15], extending the single pattern algorithm of Amir, Benson and Farach [2] gives a running time of $O(n + M^2)$ for the more general case, where $n$ is the number of codewords in the compressed representation of the text and $M$ is the sum of the lengths of all patterns. We prove that it is in fact possible to achieve $O(n \log M + M)$ or $O(n + M^{1+\varepsilon})$ complexity. While not linear, the running times of our solutions match the single pattern bounds achieved by the previously known solutions [2,17] in a more structured and unified manner, and without using any combinatorics on words. The only nontrivial components of our method are suffix arrays, constant time range minimum queries, and balanced binary search trees. © 2013 Elsevier B.V. All rights reserved.
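To make the problem statement concrete, the following is a minimal sketch of the naive baseline: decode the LZW codewords back into the full text, then test each pattern. It is not the algorithm of the paper, which avoids decompression; the function names (`lzw_compress`, `lzw_decompress`, `any_pattern_occurs`) and the convention that the initial dictionary is an explicitly supplied alphabet are assumptions made for illustration only.

```python
def lzw_compress(text, alphabet):
    """Encode text as a list of LZW codewords (assumes every character is in alphabet)."""
    table = {ch: i for i, ch in enumerate(alphabet)}  # string -> codeword
    codes = []
    current = ""
    for ch in text:
        if current + ch in table:
            current += ch                      # extend the current phrase
        else:
            codes.append(table[current])       # emit the longest known phrase
            table[current + ch] = len(table)   # register the new phrase
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes, alphabet):
    """Rebuild the text from LZW codewords produced with the same alphabet."""
    if not codes:
        return ""
    table = list(alphabet)                     # codeword -> string
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code < len(table):
            entry = table[code]
        else:
            # Codeword refers to the phrase being defined right now.
            entry = prev + prev[0]
        out.append(entry)
        table.append(prev + entry[0])
        prev = entry
    return "".join(out)


def any_pattern_occurs(codes, alphabet, patterns):
    """Naive baseline: decompress fully, then search for each pattern."""
    text = lzw_decompress(codes, alphabet)
    return any(p in text for p in patterns)
```

This baseline takes time proportional to the length of the uncompressed text, which can be far larger than the number of codewords $n$; the bounds in the abstract are stated in terms of $n$ and $M$ precisely to avoid that cost.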
Pages: 34-41
Number of pages: 8