Off-line dictionary-based compression

Cited by: 162
Authors
Larsson, NJ
Moffat, A
Affiliations
[1] Lund Univ, Dept Comp Sci, S-22100 Lund, Sweden
[2] Univ Melbourne, Dept Comp Sci & Software Engn, Melbourne, Vic 3010, Australia
Funding
Swedish Research Council; Australian Research Council
Keywords
dictionary-based modeling; hierarchical modeling; phrase-based compression; text compression;
DOI
10.1109/5.892708
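Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract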
Dictionary-based modeling is a mechanism used in many practical compression schemes. In most implementations of dictionary-based compression, the encoder operates on-line, incrementally inferring its dictionary of available phrases from previous parts of the message. An alternative approach is to use the full message to infer a complete dictionary in advance, and to include an explicit representation of the dictionary as part of the compressed message. In this investigation, we develop a compression scheme that combines a simple but powerful phrase derivation method with a compact dictionary encoding. The scheme is highly efficient, particularly in decompression, and has characteristics that make it a favourable choice when compressed data is to be searched directly. We describe data structures and algorithms that allow our mechanism to operate in linear time and space.
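The abstract does not spell out the phrase derivation method, so the following is only a rough illustration of what inferring a complete dictionary from the full message can look like. It is a minimal sketch that repeatedly replaces the most frequent adjacent pair of symbols with a new phrase symbol. It rescans the sequence on every round, so it does not achieve the linear time and space the abstract describes, and it omits the compact dictionary encoding entirely. The function names (`count_pairs`, `replace_pair`, `compress`, `expand`) are hypothetical and chosen for this example.

```python
from collections import Counter


def count_pairs(seq):
    """Count adjacent pairs, skipping occurrences that overlap the previous one (e.g. in "aaa")."""
    counts = Counter()
    last = {}  # pair -> index of the most recently counted occurrence
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        if last.get(pair) == i - 1:
            continue  # overlaps the occurrence counted at i - 1
        counts[pair] += 1
        last[pair] = i
    return counts


def replace_pair(seq, pair, sym):
    """Replace non-overlapping occurrences of pair with sym, scanning left to right."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(sym)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def compress(message):
    """Return (sequence, rules): terminals are 1-char strings, phrase symbols are ints."""
    seq = list(message)
    rules = {}  # phrase symbol -> (left symbol, right symbol)
    while True:
        counts = count_pairs(seq)
        if not counts:
            break
        pair, n = counts.most_common(1)[0]
        if n < 2:
            break  # no pair repeats, so another phrase would not pay off
        sym = len(rules)
        rules[sym] = pair
        seq = replace_pair(seq, pair, sym)
    return seq, rules


def expand(seq, rules):
    """Decode by expanding phrase symbols with an explicit stack (no recursion)."""
    out = []
    stack = list(reversed(seq))
    while stack:
        s = stack.pop()
        if s in rules:
            left, right = rules[s]
            stack.append(right)
            stack.append(left)
        else:
            out.append(s)
    return "".join(out)


message = "abracadabra abracadabra"
seq, rules = compress(message)
print(len(message), len(seq), len(rules))
```

Decoding here needs only the rule table and a single pass over the sequence, which gives some intuition for why an explicit, precomputed dictionary can make decompression cheap. A real encoder would still have to serialize `rules` and entropy-code `seq`, which this sketch does not attempt.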
Pages: 1722-1732
Page count: 11
Related papers
50 in total
  • [1] Offline dictionary-based compression
    Larsson, NJ
    Moffat, A
    [J]. DCC '99 - DATA COMPRESSION CONFERENCE, PROCEEDINGS, 1999, : 296 - 305
  • [2] Programmability in dictionary-based compression
    Heikkinen, Jari
    Takala, Janno
    [J]. 2006 INTERNATIONAL SYMPOSIUM ON SYSTEM-ON-CHIP PROCEEDINGS, 2006, : 171 - +
  • [3] Revisiting dictionary-based compression
    Skibinski, P
    Grabowski, S
    Deorowicz, S
    [J]. SOFTWARE-PRACTICE & EXPERIENCE, 2005, 35 (15): : 1455 - 1476
  • [4] SE-Compression: A Generalization of Dictionary-Based Compression
    Popa, Ionut
    [J]. COMPUTER JOURNAL, 2011, 54 (11): : 1876 - 1881
  • [5] Dictionary-based fast transform for text compression
    Sun, WF
    Zhang, N
    Mukherjee, A
    [J]. ITCC 2003: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: COMPUTERS AND COMMUNICATIONS, PROCEEDINGS, 2003, : 176 - 182
  • [6] Lossy dictionary-based image compression method
    Dudek, Gabriela
    Borys, Przemyslaw
    Grzywna, Zbigniew J.
    [J]. IMAGE AND VISION COMPUTING, 2007, 25 (06) : 883 - 889
  • [7] Off-line compression by extensible motifs
    Apostolico, A
    Comin, M
    Parida, L
    [J]. DCC 2005: DATA COMPRESSION CONFERENCE, PROCEEDINGS, 2005, : 450 - 450
  • [8] Block merging for off-line compression
    Wan, R
    Moffat, A
    [J]. COMBINATORIAL PATTERN MATCHING, 2002, 2373 : 32 - 41
  • [9] Block merging for off-line compression
    Wan, Raymond
    Moffat, Alistair
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2007, 58 (01): : 3 - 14
  • [10] Sample Selection for Dictionary-Based Corpus Compression
    Hoobin, Christopher
    Puglisi, Simon
    Zobel, Justin
    [J]. PROCEEDINGS OF THE 34TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR'11), 2011, : 1137 - 1138