DZip: improved general-purpose lossless compression based on novel neural network modeling

Cited by: 19
Authors
Goyal, Mohit [1]
Tatwawadi, Kedar [2]
Chandak, Shubham [2]
Ochoa, Idoia [1, 3]
Affiliations
[1] Univ Illinois, Elect & Comp Engn, Urbana, IL 61801 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[3] Univ Navarra, Dept Elect Engn, Pamplona, Spain
DOI
10.1109/DCC50243.2021.00023
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
We consider lossless compression based on statistical data modeling followed by prediction-based encoding, where an accurate statistical model for the input data leads to substantial improvements in compression. We propose DZip, a general-purpose compressor for sequential data that exploits the well-known modeling capabilities of neural networks (NNs) for prediction, followed by arithmetic coding. DZip uses a novel hybrid architecture based on adaptive and semi-adaptive training. Unlike most NN-based compressors, DZip does not require additional training data and is not restricted to specific data types. The proposed compressor outperforms general-purpose compressors such as Gzip (29% size reduction on average) and 7zip (12% size reduction on average) on a variety of real datasets, achieves near-optimal compression on synthetic datasets, and performs close to specialized compressors for large sequence lengths, without any human input. While the main limitation of NN-based compressors is generally the encoding/decoding speed, we empirically demonstrate that DZip achieves comparable compression ratio to other NN-based compressors while being several times faster. The source code for DZip and links to the datasets are available at https://github.com/mohit1997/Dzip-torch.
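The link between prediction quality and compressed size that the abstract relies on can be illustrated with a minimal sketch. It is not DZip's method: a simple adaptive byte-count model stands in for the neural network, and no actual arithmetic coder is run. The function name `adaptive_code_length_bits` and the `order` parameter are hypothetical, chosen for this illustration. The sketch only uses the standard fact that an arithmetic coder driven by a predicted probability p for the next symbol spends roughly -log2(p) bits on it.

```python
import math
from collections import defaultdict


def adaptive_code_length_bits(data: bytes, order: int = 2) -> float:
    """Estimate the arithmetic-coded size (in bits) of `data`.

    Assumption for illustration: the predictor is an adaptive count model
    conditioned on the previous `order` bytes, standing in for a neural
    network. Only the ideal code length is computed, not a bitstream.
    """
    # Per-context symbol counts, initialised to 1 (Laplace smoothing) so
    # that every byte value always has nonzero probability.
    counts = defaultdict(lambda: [1] * 256)
    totals = defaultdict(lambda: 256)
    bits = 0.0
    for i, sym in enumerate(data):
        ctx = data[max(0, i - order):i]
        p = counts[ctx][sym] / totals[ctx]
        bits += -math.log2(p)
        # Update the model only after the symbol is "coded", so a decoder
        # could rebuild the same model from already-decoded symbols.
        counts[ctx][sym] += 1
        totals[ctx] += 1
    return bits


if __name__ == "__main__":
    sample = b"ab" * 500
    est = adaptive_code_length_bits(sample)
    print(f"raw: {8 * len(sample)} bits, estimated coded size: {est:.0f} bits")
```

Because the model is updated only from symbols already seen, no separately stored model or extra training data is needed in this sketch; a more accurate predictor lowers the summed -log2(p) and therefore the estimated size.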
Pages: 153-162
Page count: 10