Revisiting dictionary-based compression

Cited by: 26
Authors
Skibinski, P
Grabowski, S
Deorowicz, S
Affiliations
[1] Tech Univ Lodz, Dept Comp Engn, PL-90924 Lodz, Poland
[2] Univ Wroclaw, Inst Comp Sci, PL-51151 Wroclaw, Poland
[3] Silesian Tech Univ, Inst Comp Sci, PL-44100 Gliwice, Poland
Source
SOFTWARE-PRACTICE & EXPERIENCE | 2005 / Vol. 35 / Issue 15
Keywords
lossless data compression; preprocessing; text compression; dictionary compression
DOI
10.1002/spe.678
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
An attractive way to increase text compression is to replace words with references to a text dictionary given in advance. Although a few works exist in this area, they do not fully exploit the compression possibilities, nor do they consider alternative preprocessing variants for the various compressors used in the subsequent phase. In this paper, we discuss several aspects of dictionary-based compression, including compact dictionary representation, and present a PPM/BWCA-oriented scheme, the word replacing transformation, which achieves compression ratios 2-6% higher than the state-of-the-art StarNT (2003) text preprocessor while working at a greater speed. We also present an alternative scheme designed for LZ77 compressors, whose advantage over StarNT reaches up to 14% in combination with gzip. Copyright (c) 2005 John Wiley & Sons, Ltd.
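The core idea of the abstract, replacing words with references to a dictionary given in advance before handing the text to a general-purpose compressor, can be illustrated with a toy sketch. This is not the paper's word replacing transformation or StarNT: the codeword layout (base-128 digits with the high bit set), the function names `encode` and `decode`, and the tiny `DICTIONARY` are all assumptions made for illustration, and the input is assumed to be 7-bit ASCII.

```python
import re
import zlib
from typing import Sequence

WORD = re.compile(rb"[A-Za-z]+")      # maximal runs of ASCII letters
CODE = re.compile(rb"[\x80-\xff]+")   # maximal runs of high-bit bytes (codewords)

# Hypothetical fixed dictionary, known to both encoder and decoder in advance.
# Earlier entries get shorter codewords, so frequent words should come first.
DICTIONARY = [b"the", b"of", b"and", b"to", b"in",
              b"compression", b"dictionary", b"text", b"words"]

SAMPLE = (b"the compression of text with the dictionary of words "
          b"and the text in the dictionary")


def make_codeword(index: int) -> bytes:
    """Encode a dictionary index as base-128 digits, each with the high bit set."""
    digits = [0x80 | (index % 128)]
    index //= 128
    while index:
        digits.append(0x80 | (index % 128))
        index //= 128
    return bytes(reversed(digits))


def read_codeword(code: bytes) -> int:
    """Inverse of make_codeword."""
    index = 0
    for byte in code:
        index = index * 128 + (byte & 0x7F)
    return index


def encode(text: bytes, dictionary: Sequence[bytes]) -> bytes:
    """Replace every dictionary word with its codeword; leave other text as-is."""
    if any(b >= 0x80 for b in text):
        raise ValueError("this toy transform only accepts 7-bit ASCII input")
    index_of = {word: i for i, word in enumerate(dictionary)}

    def replace(match):
        word = match.group(0)
        i = index_of.get(word)
        return word if i is None else make_codeword(i)

    return WORD.sub(replace, text)


def decode(data: bytes, dictionary: Sequence[bytes]) -> bytes:
    """Expand every codeword back into the dictionary word it refers to."""
    return CODE.sub(lambda m: dictionary[read_codeword(m.group(0))], data)


if __name__ == "__main__":
    transformed = encode(SAMPLE, DICTIONARY)
    # The transformed text is then passed to an ordinary compressor (zlib here).
    packed = zlib.compress(transformed, 9)
    restored = decode(zlib.decompress(packed), DICTIONARY)
    print(len(SAMPLE), len(transformed), restored == SAMPLE)
```

Because words are matched as maximal letter runs, two codewords are always separated by at least one non-letter byte, which is what lets this sketch decode without explicit codeword terminators. A real preprocessor would also need to handle capitalization, non-ASCII input, and a far larger dictionary.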
Pages: 1455-1476
Page count: 22
Related papers
50 in total
  • [1] Offline dictionary-based compression
    Larsson, NJ
    Moffat, A
    [J]. DCC '99 - DATA COMPRESSION CONFERENCE, PROCEEDINGS, 1999, : 296 - 305
  • [2] Programmability in dictionary-based compression
    Heikkinen, Jari
    Takala, Jarmo
    [J]. 2006 INTERNATIONAL SYMPOSIUM ON SYSTEM-ON-CHIP PROCEEDINGS, 2006, : 171 - +
  • [3] SE-Compression: A Generalization of Dictionary-Based Compression
    Popa, Ionut
    [J]. COMPUTER JOURNAL, 2011, 54 (11): : 1876 - 1881
  • [4] Dictionary-based fast transform for text compression
    Sun, WF
    Zhang, N
    Mukherjee, A
    [J]. ITCC 2003: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: COMPUTERS AND COMMUNICATIONS, PROCEEDINGS, 2003, : 176 - 182
  • [5] Lossy dictionary-based image compression method
    Dudek, Gabriela
    Borys, Przemyslaw
    Grzywna, Zbigniew J.
    [J]. IMAGE AND VISION COMPUTING, 2007, 25 (06) : 883 - 889
  • [6] Off-line dictionary-based compression
    Larsson, NJ
    Moffat, A
    [J]. PROCEEDINGS OF THE IEEE, 2000, 88 (11) : 1722 - 1732
  • [7] Fast Dictionary-Based Compression for Inverted Indexes
    Pibiri, Giulio Ermanno
    Petri, Matthias
    Moffat, Alistair
    [J]. PROCEEDINGS OF THE TWELFTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'19), 2019, : 6 - 14
  • [8] Sample Selection for Dictionary-Based Corpus Compression
    Hoobin, Christopher
    Puglisi, Simon
    Zobel, Justin
    [J]. PROCEEDINGS OF THE 34TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR'11), 2011, : 1137 - 1138
  • [9] Dictionary-based order-preserving string compression
    Antoshenkov G.
    [J]. The VLDB Journal, 1997, 6 (1) : 26 - 39
  • [10] Dictionary-based program compression on customizable processor architectures
    Heikkinen, Jari
    Takala, Jarmo
    Corporaal, Henk
    [J]. MICROPROCESSORS AND MICROSYSTEMS, 2009, 33 (02) : 139 - 153