A general analytical model for spatial and temporal performance of bitmap index compression algorithms in Big Data

Authors
Wu, Yinjun [1]
Chen, Zhen [1]
Wen, Yuhao [1]
Cao, Junwei [1]
Zheng, Wenxun [1]
Ma, Ge [1]
Affiliations
[1] Tsinghua Univ, Res Inst Informat Technol, Tsinghua Natl Lab Informat Sci & Technol TNList, Beijing, Peoples R China
Keywords
bitmap index; Big Data; COMBAT; SECOMPAX; CONCISE; data compression; performance evaluation
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Bitmap indexing makes Boolean operations in data retrieval flexible, and query processing based on it is fast. It is therefore widely used in big data analytics platforms such as Druid and Spark. However, a bitmap index can consume a large amount of memory, which has led to a variety of bitmap index compression algorithms that reduce space without sacrificing temporal performance. In practice, it is often difficult to choose a suitable algorithm for a specific problem. Moreover, after devising a new algorithm that may outperform existing ones, it is essential to evaluate its performance in theory. Without appropriate theoretical analysis, the deficiencies of a new algorithm can be spotted only after the final experimental results are in, which wastes much time and effort. In this paper, we propose a general analytical model for both the spatial and temporal performance of bitmap index compression algorithms; the model can be applied to all kinds of algorithms derived from WAH (word-aligned hybrid). The model treats two types of bitmap distribution separately: uniformly distributed bitmaps and clustered bitmaps. To illustrate the model, several bitmap index compression algorithms are analyzed and compared with each other: COMBAT (COMbining Binary And Ternary encoding), SECOMPAX (Scope Extended COMPAX) and CONCISE (Compressed 'n' Composable Integer Set), all of which are derived from WAH. Evaluation results for these algorithms, obtained by MATLAB simulation, are also presented. This paper paves the way for further research on the performance evaluation of bitmap index compression algorithms.
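To make the two bitmap types in the abstract concrete, the following is a minimal sketch of a basic WAH-style encoder (32-bit words, 31-bit groups) applied to a uniformly distributed bitmap and a clustered one. It is not the paper's analytical model or its MATLAB simulation. The function names, the two-state Markov chain used to generate clustered bitmaps, and the parameters `density` and `mean_run` are assumptions made for illustration only.

```python
import random

GROUP = 31                     # payload bits per 32-bit WAH word
MAX_COUNT = (1 << 30) - 1      # largest run counter a fill word can hold


def wah_encode(bits):
    """Encode a list of 0/1 bits into WAH-style words (zero-padded to 31-bit groups)."""
    words = []
    padded = bits + [0] * (-len(bits) % GROUP)
    for i in range(0, len(padded), GROUP):
        group = padded[i:i + GROUP]
        ones = sum(group)
        if ones == 0 or ones == GROUP:
            # Fill word: bit 31 = 1, bit 30 = fill value, low 30 bits = group count.
            header = (1 << 31) | ((1 if ones else 0) << 30)
            same_fill = bool(words) and (words[-1] >> 30) == (header >> 30)
            if same_fill and (words[-1] & MAX_COUNT) < MAX_COUNT:
                words[-1] += 1
            else:
                words.append(header | 1)
        else:
            # Literal word: bit 31 = 0, low 31 bits hold the group verbatim.
            value = 0
            for b in group:
                value = (value << 1) | b
            words.append(value)
    return words


def wah_decode(words):
    """Inverse of wah_encode; returns the padded bit list."""
    bits = []
    for w in words:
        if w >> 31:
            bits.extend([(w >> 30) & 1] * (GROUP * (w & MAX_COUNT)))
        else:
            bits.extend((w >> k) & 1 for k in range(GROUP - 1, -1, -1))
    return bits


def uniform_bitmap(n, density, rng):
    """Each bit is set independently with probability `density`."""
    return [1 if rng.random() < density else 0 for _ in range(n)]


def clustered_bitmap(n, density, mean_run, rng):
    """Assumed generator: two-state Markov chain whose runs of 1s average `mean_run` bits."""
    p_leave_one = 1.0 / mean_run
    # Chosen so that the stationary fraction of 1s equals `density`.
    p_enter_one = density * p_leave_one / (1.0 - density)
    bits, state = [], 0
    for _ in range(n):
        bits.append(state)
        if state == 1:
            if rng.random() < p_leave_one:
                state = 0
        elif rng.random() < p_enter_one:
            state = 1
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    n = 31 * 10_000
    for name, bitmap in [
        ("uniform", uniform_bitmap(n, 0.01, rng)),
        ("clustered", clustered_bitmap(n, 0.01, 50, rng)),
    ]:
        words = wah_encode(bitmap)
        print(f"{name}: {len(words)} words for {n // GROUP} groups")
```

Running the script prints the number of encoded words for each bitmap at the same bit density, which gives an empirical counterpart to the kind of spatial comparison an analytical model would predict.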
Pages: 10