On the complexity of optimal grammar-based compression

Authors
Arpe, Jan [1]
Reischuk, Rüdiger [1]
Affiliation
[1] Univ Lubeck, Inst Theoret Informat, Ratzeburger Allee 160, D-23538 Lubeck, Germany
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Given a string, the task of grammar-based compression is to find a small context-free grammar that generates exactly that string. We investigate the relationship between grammar-based compression of strings over unbounded and bounded alphabets. Specifically, we show how to transform a grammar for a string over an unbounded alphabet into a grammar for a block coding of that string over a fixed bounded alphabet and vice versa. From these constructions, we obtain asymptotically tight relationships between the minimum grammar sizes for strings and their block codings. Furthermore, we exploit an improved bound of our construction for overlap-free block codings to show that a polynomial-time algorithm for approximating the minimum grammar for binary strings within a factor of c yields a polynomial-time algorithm for approximating the minimum grammar for strings over arbitrary alphabets within a factor of 24c + epsilon (for arbitrary epsilon > 0). Currently, the latter problem is known to be NP-hard to approximate within a factor of 8569/8568. Since there is some hope of proving a nonconstant lower bound, our results may provide a first step towards solving the long-standing open question of whether minimum grammar-based compression of binary strings is NP-complete.
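As a rough illustration of the opening definition, the sketch below represents a context-free grammar that generates exactly one string and measures its size. The names `expand`, `grammar_size`, and `example` are hypothetical, and the size measure (total number of symbols on all right-hand sides) is an assumption based on a common convention; the abstract itself does not fix a definition.

```python
from typing import Dict, List

# A grammar maps each nonterminal to its single right-hand side.
# Any symbol that is not a key of the dictionary is treated as a terminal.
Grammar = Dict[str, List[str]]


def expand(grammar: Grammar, symbol: str) -> str:
    """Return the unique string derived from `symbol`.

    Assumes the grammar is acyclic and has exactly one rule per
    nonterminal, so the derivation is deterministic and terminates.
    """
    if symbol not in grammar:
        return symbol  # terminal symbol: derives itself
    return "".join(expand(grammar, s) for s in grammar[symbol])


def grammar_size(grammar: Grammar) -> int:
    """Total number of symbols on all right-hand sides (assumed size measure)."""
    return sum(len(rhs) for rhs in grammar.values())


# Hypothetical example: three rules of length 2 generate a string of length 8.
example: Grammar = {
    "S": ["A", "A"],
    "A": ["B", "B"],
    "B": ["a", "b"],
}

generated = expand(example, "S")
size = grammar_size(example)
```

Here `generated` is the string `abababab` and `size` is 6, so the grammar is smaller than the string it encodes. Finding a grammar of minimum size for a given string is the optimization problem the abstract refers to.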
Pages: 173 / +
Page count: 2