VParC: A Compression Scheme for Numeric Data in Column-Oriented Databases

Authors
Yan, Ke [1]
Zhu, Hong [1]
Lu, Kevin [2]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[2] Brunel Univ, Coll Business Arts & Social Sci, Uxbridge UB8 3PH, Middx, England
Keywords
Column-stores; data management; compression; query processing; analytical workload
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compression is one of the most important techniques in data management and is commonly used to improve query efficiency in databases. However, existing compression algorithms applied to numeric data in column-oriented databases have several restrictions. First, a given compression algorithm suits only columns with certain data distributions, not all kinds of data columns; second, a data column with an irregular distribution is hard to compress; third, a column compressed with heavyweight methods cannot be operated on before decompression, which leads to inefficient queries. Based on the observation that a column is more likely to exhibit sub-regularity than global regularity, we developed a compression scheme called Vertically Partitioning Compression (VParC). This method is suitable for columns with different data distributions, and in some cases even for irregular columns. More importantly, data compressed by VParC can be operated on directly, without prior decompression. This paper presents the details of the compression and query evaluation approaches, and our experimental results demonstrate the promising features of VParC.
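The abstract does not describe how VParC encodes or partitions a column, so the following is not VParC. It is only a minimal sketch of the general idea of evaluating a predicate directly on compressed numeric data, using frame-of-reference encoding, a well-known lightweight technique chosen here purely for illustration. The names `ForColumn`, `encode`, `decode`, and `select_greater_than` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ForColumn:
    """Frame-of-reference encoded column: value[i] == base + offsets[i]."""
    base: int
    offsets: List[int]


def encode(values: List[int]) -> ForColumn:
    # Store the minimum once and keep only small non-negative offsets.
    # Assumes a non-empty column; min() raises ValueError otherwise.
    base = min(values)
    return ForColumn(base=base, offsets=[v - base for v in values])


def decode(col: ForColumn) -> List[int]:
    # Reconstruct the original values (not needed for the query below).
    return [col.base + o for o in col.offsets]


def select_greater_than(col: ForColumn, threshold: int) -> List[int]:
    """Return row positions where value > threshold, without decoding."""
    # Rewrite the predicate into the encoded domain once,
    # then compare the stored offsets directly.
    encoded_threshold = threshold - col.base
    return [i for i, o in enumerate(col.offsets) if o > encoded_threshold]


# Illustrative data only; not taken from the paper.
column = [1003, 1001, 1010, 1007, 1002]
encoded = encode(column)
matches = select_greater_than(encoded, 1005)
print(matches)
```

Because the predicate is translated once into the encoded domain, the scan touches only the compact offsets. Heavyweight schemes generally do not allow this, which is the third restriction the abstract lists.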
Pages: 1-11
Page count: 11