The Smallest Grammar Problem Revisited

Cited by: 5
Authors
Bannai, Hideo [1]
Hirayama, Momoko [2]
Hucke, Danny [3]
Inenaga, Shunsuke [2]
Jez, Artur [4]
Lohrey, Markus [3]
Reh, Carl Philipp [3]
Affiliations
[1] Tokyo Med & Dent Univ, Dept Data Sci Algorithm Design & Anal, Tokyo 1138510, Japan
[2] Kyushu Univ, Dept Informat, Fukuoka 8190395, Japan
[3] Univ Siegen, Dept Elect Engn & Comp Sci, D-57076 Siegen, Germany
[4] Univ Wroclaw, Inst Comp Sci, PL-50383 Wroclaw, Poland
Keywords
String compression; smallest grammar problem; approximation algorithm; LZ78; RePair; COMPRESSED STRINGS; APPROXIMATION; SEQUENCES
DOI
10.1109/TIT.2020.3038147
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In a seminal paper, Charikar et al. derive upper and lower bounds on the approximation ratios of several grammar-based compressors, but in every case a gap remains between the lower and the upper bound. Here the gaps for LZ78 and BISECTION are closed by showing that the approximation ratio of LZ78 is $\Theta((n/\log n)^{2/3})$, whereas the approximation ratio of BISECTION is $\Theta((n/\log n)^{1/2})$. In addition, the lower bound for RePair is improved from $\Omega(\sqrt{\log n})$ to $\Omega(\log n/\log\log n)$. Finally, results of Arpe and Reischuk relating grammar-based compression for arbitrary alphabets and binary alphabets are improved.
Pages: 317-328
Number of pages: 12