Dynamic Virtual Chunks: On Supporting Efficient Accesses to Compressed Scientific Data

Cited by: 5
Authors
Zhao, Dongfang [1]
Qiao, Kan [1]
Yin, Jian [2]
Raicu, Ioan [1, 3]
Affiliations
[1] IIT, Dept Comp Sci, Chicago, IL 60616 USA
[2] Pacific NW Natl Lab, Div Math & Comp Sci, Richland, WA 99354 USA
[3] Argonne Natl Lab, Div Comp Sci, 9700 S Cass Ave, Argonne, IL 60439 USA
Funding
National Science Foundation (USA);
Keywords
File compression; distributed file systems; parallel file systems; big data; data-intensive computing; scientific computing;
DOI
10.1109/TSC.2015.2456889
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data compression can ease the I/O pressure of data-intensive scientific applications. Unfortunately, the conventional approach of naively applying compression at the file or block level forces a trade-off between efficient random access and a high compression ratio. File-level compression barely supports efficient random access to the compressed data: any retrieval request must trigger decompression from the beginning of the compressed file. Block-level compression provides flexible random access to the compressed blocks, but applying the compressor to each and every block introduces extra overhead that degrades the overall compression ratio. This paper extends our prior work, which introduced virtual chunks that offer efficient random access to compressed scientific data without sacrificing the compression ratio. Virtual chunks are logical blocks pointed at by appended references, without breaking the physical continuity of the file content. These references allow decompression to start from an arbitrary position (efficient random access), while no per-block overhead is introduced because the file's physical entirety is retained (high compression ratio). One limitation of virtual chunks is that they support only static references. This paper presents the algorithms, analysis, and evaluation of dynamic virtual chunks, which handle cases where the references are updated dynamically.
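The idea of appended references in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple delta encoding as a stand-in compressor, and the names `encode`, `read_at`, and `stride` are hypothetical. The delta stream stays one continuous sequence, and a separate list of references records the absolute value at selected positions so that decoding can start near the requested element instead of at the beginning.

```python
from bisect import bisect_right


def encode(values, stride):
    """Delta-encode `values` as one continuous stream plus appended references.

    Assumption for illustration: delta encoding stands in for the compressor.
    Returns (deltas, ref_idx, ref_val), where ref_val[j] is the original
    value at position ref_idx[j]. The delta stream is never split into blocks.
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")
    if not values:
        return [], [], []
    # First entry keeps the absolute value; the rest are differences.
    deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
    ref_idx = list(range(0, len(values), stride))
    ref_val = [values[i] for i in ref_idx]
    return deltas, ref_idx, ref_val


def read_at(deltas, ref_idx, ref_val, i):
    """Recover element i, starting from the nearest reference at or before i."""
    if not 0 <= i < len(deltas):
        raise IndexError(i)
    j = bisect_right(ref_idx, i) - 1   # last reference with index <= i
    value = ref_val[j]
    # Apply only the deltas between the reference and the target position.
    for k in range(ref_idx[j] + 1, i + 1):
        value += deltas[k]
    return value


values = [10, 12, 15, 15, 20, 26, 27, 30]
deltas, ref_idx, ref_val = encode(values, stride=3)
print(read_at(deltas, ref_idx, ref_val, 5))  # decodes from position 3, not 0
```

In this sketch a smaller `stride` means more references (more stored metadata) but fewer deltas to replay per read, which mirrors the trade-off between random-access cost and storage overhead that the abstract describes. Because `read_at` looks references up by binary search rather than by a fixed stride, the reference list could in principle be changed after encoding, which is the situation the dynamic variant targets.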
Pages: 96-109
Page count: 14