Compressed parameterized pattern matching

Cited by: 5
Authors
Beal, Richard [1]
Adjeroh, Donald [1]
Affiliation
[1] West Virginia University, Lane Department of Computer Science and Electrical Engineering, Morgantown, WV 26506, USA
Funding
National Science Foundation (USA)
Keywords
Parameterized matching; Compressed pattern matching; Parameterized string; Lossless compression; Parameterized arithmetic coding; Parameterized border; p-match; p-string; pAC; p-border; Catenate; Tunstall codes; Huffman codes; LZSS; Algorithms; Construction; Trees
DOI
10.1016/j.tcs.2015.09.015
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
Pattern matching between traditional strings is well-defined for both uncompressed and compressed sequences. Prior to this work, parameterized pattern matching (p-matching) was defined predominantly by the matching between uncompressed parameterized strings (p-strings) from the constant alphabet Σ and the parameter alphabet Π. In this work, we define the compressed parameterized pattern matching (compressed p-matching) problem to find all of the p-matches between a pattern P and text T, using only P and the compressed text T_c. Initially, we present parameterized compression (p-compression) as a new way to losslessly compress data. Experimentally, we show that p-compression is competitive with various other standard compression schemes. Subsequently, we provide the compression and decompression algorithms. Next, two different approaches are developed to address the compressed p-matching problem: (1) using the recently proposed parameterized arithmetic codes (pAC) and (2) using the parameterized border array (p-border). Our general solution is independent of the underlying compression scheme. The results are further examined for catenate, Tunstall codes, Huffman codes, and LZSS. © 2015 Elsevier B.V. All rights reserved.
Pages: 129-142
Page count: 14
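The p-match relation that the abstract builds on can be illustrated with a short sketch. It is not the paper's compressed algorithm: it works on uncompressed text and only shows what a p-match is, namely that constants from Σ must agree exactly while parameters from Π must correspond under a one-to-one renaming. The function names (`prev_encode`, `p_match`, `find_p_matches`) and the example alphabets are assumptions made for illustration. The `prev_encode` helper follows the well-known "prev" encoding from Baker's work on parameterized matching, under which two p-strings p-match exactly when their encodings are equal.

```python
def prev_encode(s, params):
    """Baker-style prev encoding: constants stay as characters; each
    parameter becomes 0 on first occurrence, otherwise the distance
    back to its previous occurrence."""
    last_seen = {}
    encoded = []
    for i, ch in enumerate(s):
        if ch in params:
            encoded.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
        else:
            encoded.append(ch)
    return encoded


def p_match(s, t, params):
    """True if s and t p-match: same constants, parameters related by a bijection."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        if (a in params) != (b in params):
            return False  # a constant can never pair with a parameter
        if a not in params:
            if a != b:
                return False  # constants must agree exactly
            continue
        # setdefault records the pairing on first sight and checks it afterwards
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def find_p_matches(pattern, text, params):
    """Naive scan over uncompressed text: every start index where pattern p-matches."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if p_match(pattern, text[i:i + m], params)]


# Hypothetical alphabets: constants {a, b}, parameters {x, y, z}.
PARAMS = {"x", "y", "z"}
print(find_p_matches("xay", "yazxaxzay", PARAMS))  # [0, 6]
```

The window at index 3 (`xax`) is rejected because it would map both `x` and `y` to the same parameter, which is not a bijection. The naive scan costs O(nm) time; the approaches named in the abstract (pAC and the p-border array) are aimed at matching against the compressed text instead.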