Read-Performance Optimization for Deduplication-Based Storage Systems in the Cloud

Cited by: 51
Authors
Mao, Bo [1]
Jiang, Hong [2]
Wu, Suzhen [1]
Fu, Yinjin [3]
Tian, Lei [2]
Affiliations
[1] Xiamen Univ, Xiamen 361005, Peoples R China
[2] Univ Nebraska, Lincoln, NE USA
[3] Natl Univ Def Technol, Changsha 410073, Hunan, Peoples R China
Funding
U.S. National Science Foundation
Keywords
Storage systems; data deduplication; virtual machine; solid-state drive; read performance; Design; Performance
DOI
10.1145/2512348
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Data deduplication has been demonstrated to be an effective technique for reducing the total data transferred over the network and the storage space in cloud backup, archiving, and primary storage systems, such as virtual machine (VM) platforms. However, the performance of restore operations from a deduplicated backup can be significantly lower than that without deduplication. The main reason is that, after deduplication, a file or block is split into multiple small data chunks that are often located on different disks, so a subsequent read operation can invoke many disk I/Os involving multiple disks and thus degrade read performance significantly. While this problem has been by and large ignored in the literature thus far, we argue that the time is ripe to pay significant attention to it in light of the emerging cloud storage applications and the increasing popularity of the VM platform in the cloud. This is because, in a cloud storage or VM environment, a simple read request on the client side may translate into a restore operation if the data to be read or a VM suspended by the user was previously deduplicated when written to the cloud or the VM storage server, a likely scenario considering the network bandwidth and storage capacity concerns in such an environment. To address this problem, in this article, we propose SAR, an SSD (solid-state drive)-Assisted Read scheme, that effectively exploits the high random-read performance of SSDs and the unique data-sharing characteristic of deduplication-based storage systems by storing in SSDs the unique data chunks with high reference count, small size, and nonsequential characteristics. In this way, many read requests to HDDs are replaced by read requests to SSDs, thus significantly improving the read performance of deduplication-based storage systems in the cloud.
Extensive trace-driven and VM restore evaluations on the prototype implementation of SAR show that SAR significantly outperforms the traditional deduplication-based and flash-based cache schemes in terms of average response time.
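
The selection idea described in the abstract (placing unique chunks with a high reference count, small size, and nonsequential layout on SSD) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Chunk` fields, the `min_refs` and `max_size` thresholds, the references-per-byte ranking, and the greedy capacity fill are all assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    fingerprint: str   # content hash identifying a unique chunk
    size: int          # chunk size in bytes
    ref_count: int     # how many files/blocks reference this chunk
    sequential: bool   # True if stored contiguously with its neighbours on HDD


def select_for_ssd(chunks, ssd_capacity, min_refs=2, max_size=8192):
    """Pick chunk fingerprints to place on SSD (hypothetical policy).

    Keeps only chunks that are shared (ref_count >= min_refs), small
    (size <= max_size), and nonsequential, then fills the SSD greedily
    in order of references per byte. Thresholds are illustrative.
    """
    candidates = [
        c for c in chunks
        if c.ref_count >= min_refs and c.size <= max_size and not c.sequential
    ]
    candidates.sort(key=lambda c: c.ref_count / c.size, reverse=True)

    chosen, used = set(), 0
    for c in candidates:
        if used + c.size <= ssd_capacity:
            chosen.add(c.fingerprint)
            used += c.size
    return chosen


def count_reads(recipe, ssd_set):
    """Return (ssd_reads, hdd_reads) for a file's ordered chunk recipe."""
    ssd_reads = sum(1 for fp in recipe if fp in ssd_set)
    return ssd_reads, len(recipe) - ssd_reads


if __name__ == "__main__":
    chunks = [
        Chunk("a", 4096, 10, False),
        Chunk("b", 4096, 1, False),    # not shared: stays on HDD
        Chunk("c", 65536, 20, False),  # too large: stays on HDD
        Chunk("d", 4096, 5, True),     # sequential: stays on HDD
        Chunk("e", 2048, 3, False),
    ]
    on_ssd = select_for_ssd(chunks, ssd_capacity=6144)
    print(sorted(on_ssd))
    print(count_reads(["a", "b", "a", "e", "c"], on_ssd))
```

In this toy setup, reads of the selected chunks would be served by the SSD, while the remaining chunks in a file's recipe would still require HDD reads; a real system would also have to handle chunk updates, eviction, and changing reference counts.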
Pages: 22