Word-based text compression

Cited by: 0
Authors
Moffat, Alistair [1]
Affiliations
[1] Univ of Melbourne
Keywords
Computer Programming--Algorithms - Data Processing--Data Structures - Information Theory--Data Compression - Probability--Random Processes;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The development of efficient algorithms to support arithmetic coding has meant that powerful models of text can now be used for data compression. Here the implementation of models based on recognizing and recording words is considered. Move-to-the-front and several variable-order Markov models have been tested with a number of different data structures. The decisions that went into the implementation are discussed first, followed by experimental results that show English text being represented in under 2.2 bits per character. Moreover, the programs run at speeds comparable to other compression techniques and are suited for practical use.
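As a rough illustration of the word-based move-to-the-front idea named in the abstract, the sketch below splits text into alternating word and non-word tokens and replaces each repeated token by its rank in a recency list. This is not the paper's implementation: the abstract does not describe the parsing rules, data structures, or escape mechanism, so the tokenizer pattern, the `("new", token)` escape, and the function names `split_tokens`, `mtf_encode`, and `mtf_decode` are all assumptions made for illustration. The arithmetic coding stage that would turn these ranks into bits is omitted.

```python
import re


def split_tokens(text):
    """Split text into runs of alphanumerics and runs of everything else.

    Assumed tokenization rule; joining the tokens reproduces the input.
    """
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)


def mtf_encode(tokens):
    """Encode tokens as ranks in a recency list (most recent at index 0).

    A token not seen before is emitted as the hypothetical escape
    ("new", token) so the decoder can learn it.
    """
    recency = []
    codes = []
    for tok in tokens:
        if tok in recency:
            rank = recency.index(tok)
            recency.pop(rank)
            codes.append(rank)
        else:
            codes.append(("new", tok))
        recency.insert(0, tok)  # move (or add) the token to the front
    return codes


def mtf_decode(codes):
    """Invert mtf_encode by maintaining the same recency list."""
    recency = []
    tokens = []
    for code in codes:
        if isinstance(code, tuple):
            tok = code[1]            # escaped new token
        else:
            tok = recency.pop(code)  # token at the given rank
        recency.insert(0, tok)
        tokens.append(tok)
    return tokens


if __name__ == "__main__":
    sample = "the cat and the dog and the cat"
    encoded = mtf_encode(split_tokens(sample))
    print(encoded)
    print("".join(mtf_decode(encoded)) == sample)
```

Frequently repeated words and separators tend to receive small ranks, which is what makes the rank stream cheap to code with a subsequent entropy coder. The linear list search used here is only for clarity; a practical implementation would need a faster structure.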
Pages: 185-198
Related papers
  • [1] WORD-BASED TEXT COMPRESSION
    MOFFAT, A
    [J]. SOFTWARE-PRACTICE & EXPERIENCE, 1989, 19 (02): 185-198
  • [2] WORD-BASED COMPRESSION TECHNIQUE FOR TEXT FILES
    WEISS, SF
    VERNOR, RL
    [J]. JOURNAL OF LIBRARY AUTOMATION, 1978, 11 (02): 97-105
  • [3] Japanese text compression using word-based coding
    Morihara, T
    Satoh, N
    Yahagi, H
    Yoshida, S
    [J]. DCC '98 - DATA COMPRESSION CONFERENCE, 1998: 564
  • [4] Word-based block-sorting text compression
    Isal, RYK
    Moffat, A
    [J]. PROCEEDINGS OF THE 24TH AUSTRALASIAN COMPUTER SCIENCE CONFERENCE, ACSC 2001, 2001, 23 (01): 92-99
  • [5] Word-based compression methods for large text documents
    Dvorsky, J
    Pokorny, J
    Snásel, V
    [J]. DCC '99 - DATA COMPRESSION CONFERENCE, PROCEEDINGS, 1999: 523
  • [6] Word-based compression methods and indexing for text retrieval systems
    Dvorsky, J
    Pokorny, J
    Snásel, V
    [J]. ADVANCES IN DATABASES AND INFORMATION SYSTEMS, 1999, 1691: 75-84
  • [7] Boosting Text Compression with Word-Based Statistical Encoding
    Farina, Antonio
    Navarro, Gonzalo
    Parama, Jose R.
    [J]. COMPUTER JOURNAL, 2012, 55 (01): 111-131
  • [8] Application of a word-based text compression method to Japanese and Chinese texts
    Yoshida, S
    Morihara, T
    Yahagi, H
    Satoh, N
    [J]. DCC '99 - DATA COMPRESSION CONFERENCE, PROCEEDINGS, 1999: 561
  • [9] Word-based text compression using the Burrows-Wheeler transform
    Moffat, A
    Isal, RYK
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2005, 41 (05): 1175-1192
  • [10] Application of a word-based text compression method to Japanese and Chinese texts
    Yoshida, S
    Morihara, T
    Yahagi, H
    Itani, N
    [J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2002, E85A (12): 2933-2938