Word-based text compression

Authors
Moffat, Alistair [1]
Affiliation
[1] Univ of Melbourne
Keywords
Computer Programming--Algorithms - Data Processing--Data Structures - Information Theory--Data Compression - Probability--Random Processes
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The development of efficient algorithms to support arithmetic coding has meant that powerful models of text can now be used for data compression. Here the implementation of models based on recognizing and recording words is considered. Move-to-the-front and several variable-order Markov models have been tested with a number of different data structures. The decisions that went into the implementation are discussed first, followed by experimental results that show English text being represented in under 2.2 bits per character. Moreover, the programs run at speeds comparable to other compression techniques and are suited for practical use.
Pages: 185-198
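
As an illustration of the move-to-the-front word model named in the abstract, the following is a minimal sketch and not the paper's implementation. The tokenization rule (alternating runs of alphanumeric and non-alphanumeric characters), the function names, and the `(rank, literal)` output format are all assumptions made for this example. A plain Python list holds the recency order for clarity rather than speed, and the final entropy-coding stage (for example arithmetic coding of the ranks) is omitted.

```python
import re

# Assumed tokenization: alternating runs of alphanumeric "words"
# and the non-alphanumeric text between them.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+|[^A-Za-z0-9]+")


def tokenize(text):
    """Split text into tokens whose concatenation reproduces the text."""
    return TOKEN_RE.findall(text)


def mtf_encode(tokens):
    """Move-to-front coding over tokens.

    A previously seen token is emitted as (rank, None), where rank is its
    1-based position in the recency list. A new token is emitted as
    (0, token), with 0 acting as an escape code for the literal.
    """
    recent = []  # most recently used token first
    codes = []
    for tok in tokens:
        if tok in recent:
            pos = recent.index(tok)
            codes.append((pos + 1, None))
            recent.pop(pos)
        else:
            codes.append((0, tok))
        recent.insert(0, tok)  # move (or add) the token to the front
    return codes


def mtf_decode(codes):
    """Invert mtf_encode by replaying the same recency-list updates."""
    recent = []
    tokens = []
    for rank, literal in codes:
        tok = literal if rank == 0 else recent.pop(rank - 1)
        recent.insert(0, tok)
        tokens.append(tok)
    return tokens


if __name__ == "__main__":
    sample = "the cat and the dog and the cat"
    codes = mtf_encode(tokenize(sample))
    print(codes)
    print("".join(mtf_decode(codes)) == sample)
```

Recently used words receive small ranks, so a later entropy coder can represent them in few bits. A practical coder would replace the linear list search with a faster structure; which structure is best is a separate question that this sketch does not address.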