MASC: A Bitmap Index Encoding Algorithm for Fast Data Retrieval

Cited by: 0
Authors
Wen, Yuhao [1]
Wang, Han [2]
Chen, Zhen [3, 4]
Cao, Junwei [3]
Peng, Guodong [5]
Huang, Wen-Liang [6]
Hu, Ziwei [7]
Zhou, Jing [7]
Guo, Jinghong [7]
Affiliations
[1] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
[2] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[3] Tsinghua Univ, Res Inst Informat Technol, Beijing, Peoples R China
[4] Tsinghua Univ, iCtr, Beijing, Peoples R China
[5] Tsinghua Univ, Dept Mech Engn, Beijing, Peoples R China
[6] China Unicom Grp Labs, Beijing, Peoples R China
[7] State Grid Smart Grid Res Inst, Beijing, Peoples R China
Keywords
traffic archival; network forensic; network security; bitmap index encoding; bitmap index compression; PLWAH; COMPAX;
DOI
10.1109/ICC.2016.7510827
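Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808; 0809;
Abstract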
Fast retrieval from archived traffic data is essential for network security and forensic analysis. A bitmap index is a data structure that enables fast search over large data collections, but its space consumption is a persistent problem. WAH, PLWAH, and COMPAX have been proposed to compress bitmap indexes and reduce storage. In this paper, a new bitmap index encoding scheme, named MASC, is proposed to further improve the compression ratio without impairing query performance. Instead of being limited to a fixed length (31 bits) as in PLWAH and COMPAX, the stride in MASC can be as long as possible, so that runs of consecutive zero bits and nonzero bits are encoded more compactly. Because the piggyback in PLWAH carries only an individual nonzero bit, MASC replaces it with a new structure called a carrier. We also generalize the traditional literal word concept of PLWAH and COMPAX. The validity of the MASC encoding scheme is demonstrated by applying it in an Internet traffic archival system. In experiments with a real Internet traffic data set from CAIDA, MASC achieves a better compression ratio than PLWAH and COMPAX2 without a penalty in query performance.
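For readers unfamiliar with the fixed 31-bit grouping that the abstract contrasts MASC against, the following is a minimal sketch of a basic WAH-style encoder: each 32-bit word is either a literal holding 31 bitmap bits, or a fill word counting consecutive all-zero or all-one groups. It is offered as background only and is not the MASC, PLWAH, or COMPAX encoding, whose word layouts are not given in this record. The function names (`build_bitmap`, `wah_encode`, `wah_decode`) and the list-of-bits representation are assumptions made for illustration.

```python
GROUP_BITS = 31                  # payload bits carried by each 32-bit word
FILL_FLAG = 1 << 31              # most significant bit set: fill word
FILL_VALUE = 1 << 30             # second bit: value (0 or 1) of the filled bits
MAX_RUN = (1 << 30) - 1          # run counter occupies the low 30 bits
ALL_ONES = (1 << GROUP_BITS) - 1


def build_bitmap(column, value):
    """Return a list of 0/1 bits: bit i is 1 when column[i] == value."""
    return [1 if v == value else 0 for v in column]


def wah_encode(bits):
    """Encode a list of 0/1 bits into 32-bit words (WAH-style sketch)."""
    words = []
    # Pad to a whole number of groups; the caller keeps the true bit count.
    padded = bits + [0] * (-len(bits) % GROUP_BITS)
    for start in range(0, len(padded), GROUP_BITS):
        group = 0
        for b in padded[start:start + GROUP_BITS]:
            group = (group << 1) | b
        if group in (0, ALL_ONES):
            fill_bit = FILL_VALUE if group else 0
            last = words[-1] if words else 0
            same_fill = (last & FILL_FLAG) and (last & FILL_VALUE) == fill_bit
            if same_fill and (last & MAX_RUN) < MAX_RUN:
                words[-1] = last + 1              # extend the current run
            else:
                words.append(FILL_FLAG | fill_bit | 1)
        else:
            words.append(group)                   # literal word, MSB is 0
    return words


def wah_decode(words, nbits):
    """Expand the words back into the first nbits bits of the bitmap."""
    bits = []
    for w in words:
        if w & FILL_FLAG:
            bit = 1 if w & FILL_VALUE else 0
            bits.extend([bit] * (GROUP_BITS * (w & MAX_RUN)))
        else:
            bits.extend((w >> s) & 1 for s in range(GROUP_BITS - 1, -1, -1))
    return bits[:nbits]
```

In this sketch a run of zeros is only compressed in whole 31-bit groups, and a single set bit inside a group forces a full literal word. That fixed granularity is the limitation the abstract says MASC relaxes with longer strides and the carrier structure.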
Pages: 6