A general analytical model for spatial and temporal performance of bitmap index compression algorithms in Big Data

Authors
Wu, Yinjun [1]
Chen, Zhen [1]
Wen, Yuhao [1]
Cao, Junwei [1]
Zheng, Wenxun [1]
Ma, Ge [1]
Affiliations
[1] Tsinghua Univ, Res Inst Informat Technol, Tsinghua Natl Lab Informat Sci & Technol TNList, Beijing, Peoples R China
Keywords
bitmap index; Big Data; COMBAT; SECOMPAX; CONCISE; data compression; performance evaluation
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Bitmap indexing supports flexible Boolean operations in data retrieval, and query processing based on bitmap indexes is very fast. It has therefore been widely adopted in big data analytics platforms such as Druid and Spark. However, a bitmap index can consume a large amount of memory, which has led to the invention of various bitmap index compression algorithms that reduce space without sacrificing temporal performance. In practice, choosing a suitable algorithm for a specific problem is often difficult. Moreover, after devising a new algorithm that may outperform existing ones, it is essential to evaluate its performance in theory. Without appropriate theoretical analysis, the deficiencies of a new algorithm cannot be spotted until the final experimental results are obtained, which wastes much time and effort. In this paper, we propose a general analytical model of both the spatial and temporal performance of bitmap index compression algorithms; it can be applied to all kinds of algorithms derived from WAH (Word-Aligned Hybrid). The model treats two types of bitmaps separately: uniformly distributed bitmaps and clustered bitmaps. To illustrate the model, several bitmap index compression algorithms are analyzed and compared with one another: COMBAT (COMbining Binary And Ternary encoding), SECOMPAX (Scope Extended COMPAX), and CONCISE (Compressed 'n' Composable Integer Set), all of which are derived from WAH. Evaluation results from MATLAB simulations of these algorithms are also presented. This paper paves the way for further research on the performance evaluation of bitmap index compression algorithms.
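The contrast the abstract draws between uniformly distributed and clustered bitmaps can be illustrated with a minimal sketch. It is not the paper's analytical model and not its MATLAB simulation: the function names are hypothetical, the encoder only counts words for a simplified WAH scheme (31 payload bits per 32-bit word, fill-counter overflow ignored), and the clustered bitmap is generated by a two-state Markov chain, which is an assumption about how clustering might be modeled.

```python
import random

GROUP = 31  # payload bits per 32-bit word in this simplified WAH sketch


def wah_word_count(bits):
    """Count the 32-bit words a simplified WAH encoder would emit.

    A full group of identical bits extends (or starts) a fill word;
    any mixed or trailing partial group costs one literal word.
    Fill-counter overflow is ignored (an assumption of this sketch).
    """
    words = 0
    prev_fill = None  # bit value of the fill word being extended, if any
    for start in range(0, len(bits), GROUP):
        group = bits[start:start + GROUP]
        ones = sum(group)
        if len(group) == GROUP and ones in (0, GROUP):
            fill_bit = 1 if ones else 0
            if prev_fill != fill_bit:
                words += 1  # a new fill word begins
                prev_fill = fill_bit
        else:
            words += 1  # literal word
            prev_fill = None
    return words


def uniform_bitmap(n, density, rng):
    """Each bit is set independently with probability `density`."""
    return [1 if rng.random() < density else 0 for _ in range(n)]


def clustered_bitmap(n, density, clustering, rng):
    """Two-state Markov chain (assumed model): runs of 1s average
    `clustering` bits, and the stationary bit density is `density`."""
    p10 = 1.0 / clustering                 # P(1 -> 0)
    p01 = density * p10 / (1.0 - density)  # P(0 -> 1)
    bits, state = [], 0
    for _ in range(n):
        if state:
            state = 0 if rng.random() < p10 else 1
        else:
            state = 1 if rng.random() < p01 else 0
        bits.append(state)
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    n = GROUP * 2000
    print("uniform words:  ", wah_word_count(uniform_bitmap(n, 0.05, rng)))
    print("clustered words:", wah_word_count(clustered_bitmap(n, 0.05, 100, rng)))
```

At the same bit density, the clustered bitmap should need far fewer words, because long runs collapse into fill words while scattered bits force literal words. This gap is why the two distributions are worth analyzing separately.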
Pages: 10