Data deduplication mechanism for cloud storage systems

Cited by: 6
Authors
Xu, Xiaolong [1]
Tu, Qun [1]
Affiliation
[1] Nanjing Univ Posts & Telecommun, Coll Comp, Nanjing, Jiangsu, Peoples R China
Keywords
cloud storage; deduplication; DelayDedupe; replica;
DOI
10.1109/CyberC.2015.71
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Cloud storage systems provide users with low-cost, convenient network storage, which has made them increasingly popular. However, the explosive growth of data puts mounting storage pressure on these systems, and a vast amount of redundant data wastes considerable storage space. Data deduplication can effectively reduce data size by eliminating redundant data in storage systems. Current research on data deduplication, however, focuses mainly on static scenarios such as backup and archive systems and is not well suited to cloud storage systems, where data is dynamic. In this paper, we propose a deduplication system architecture for cloud storage environments and describe the process of avoiding duplication at the file level and chunk level on the client side. In the storage nodes (Snodes), we propose DelayDedupe, a delayed target-deduplication scheme based on chunk-level deduplication and the access frequency of chunks, to reduce response time. Combined with replica management, this method determines whether new duplicated chunks produced by data modification are hot and removes the duplicated chunks once they are no longer hot. The experimental results demonstrate that the DelayDedupe mechanism can effectively reduce response time and better balance the storage load across Snodes.
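The client-side flow mentioned in the abstract (a file-level check first, then a chunk-level check) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how files are chunked or fingerprinted, so fixed-size chunking, SHA-256 fingerprints, and the names `DedupIndex`, `fingerprint`, and `upload` are all assumptions made here for illustration.

```python
import hashlib


class DedupIndex:
    """Hypothetical in-memory stand-in for the server-side fingerprint index."""

    def __init__(self):
        self.file_fingerprints = set()  # fingerprints of whole files already stored
        self.chunks = {}                # chunk fingerprint -> chunk bytes


def fingerprint(data: bytes) -> str:
    # Assumption: SHA-256 is used as the fingerprint function.
    return hashlib.sha256(data).hexdigest()


def upload(index: DedupIndex, data: bytes, chunk_size: int = 4) -> int:
    """Store `data` and return how many bytes actually had to be sent."""
    # File-level check: an identical file is skipped entirely.
    file_fp = fingerprint(data)
    if file_fp in index.file_fingerprints:
        return 0

    # Chunk-level check: only chunks with unseen fingerprints are sent.
    # Assumption: fixed-size chunking; the tiny chunk_size is for demonstration.
    sent = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        chunk_fp = fingerprint(chunk)
        if chunk_fp not in index.chunks:
            index.chunks[chunk_fp] = chunk
            sent += len(chunk)

    index.file_fingerprints.add(file_fp)
    return sent


index = DedupIndex()
first = upload(index, b"aaaabbbbcccc")   # new file, all chunks are new
repeat = upload(index, b"aaaabbbbcccc")  # identical file, caught at file level
edited = upload(index, b"aaaabbbbdddd")  # shares two chunks with the first file
```

In this sketch the repeated file costs no transfer at all, and the edited file sends only its one new chunk. The delayed, access-frequency-based handling of hot chunks that DelayDedupe applies on the Snodes is not modeled here.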
Pages: 286-294
Number of pages: 9
Related papers
50 in total
  • [41] Security-Aware and Efficient Data Deduplication for Edge-Assisted Cloud Storage Systems
    Xie, Qingyuan
    Zhang, Chen
    Jia, Xiaohua
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2023, 16 (03) : 2191 - 2202
  • [42] Offline Selective Data Deduplication for Primary Storage Systems
    Park, Sejin
    Park, Chanik
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2016, E99D (02) : 370 - 382
  • [43] Secure Data Deduplication With Dynamic Access Control for Mobile Cloud Storage
    Qi, Saiyu
    Wei, Wei
    Wang, Jianfeng
    Sun, Shifeng
    Rutkowski, Leszek
    Huang, Tingwen
    Kacprzyk, Janusz
    Qi, Yong
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (04) : 2566 - 2582
  • [44] Comments on "Privacy Aware Data Deduplication for Side Channel in Cloud Storage"
    Tang, Xin
    Zhu, Yudan
    Fu, Mingjun
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2024, 12 (02) : 814 - 817
  • [45] Weight Based Deduplication for Minimizing Data Replication in Public Cloud Storage
    Pugazhendi, E.
    Sumalatha, M. R.
    Harika, Lakshmi P.
    JOURNAL OF SCIENTIFIC & INDUSTRIAL RESEARCH, 2021, 80 (03) : 260 - 269
  • [46] Fast Variable-Grained Resemblance Data Deduplication For Cloud Storage
    Ye, Xuming
    Tang, Jia
    Tian, Wenlong
    Li, Ruixuan
    Xiao, Weijun
    Geng, Yuqing
    Xu, Zhiyong
    2021 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE AND STORAGE (NAS), 2021, : 275 - 282
  • [47] Secure Image Deduplication in Cloud Storage
    Gang, Han
    Yan, Hongyang
    Xu, Lingling
    INFORMATION AND COMMUNICATION TECHNOLOGY, 2015, 9357 : 243 - 251
  • [48] Secure Deduplication on Public Cloud Storage
    Graupner, Hendrik
    Torkura, Kennedy A.
    Sukmana, Muhammad I. H.
    Meinel, Christoph
    ICBDC 2019: PROCEEDINGS OF 2019 4TH INTERNATIONAL CONFERENCE ON BIG DATA AND COMPUTING, 2019, : 34 - 41
  • [49] Fine-grained Data Deduplication and proof of storage Scheme in Public Cloud Storage
    Gajera, Hardik
    Das, Manik Lal
    2021 INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS), 2021, : 237 - 241
  • [50] A similarity clustering-based deduplication strategy in cloud storage systems
    Long, Saiqin
    Li, Zhetao
    Liu, Zihao
    Deng, Qingyong
    Oh, Sangyoon
    Komuro, Nobuyoshi
    2020 IEEE 26TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2020, : 35 - 43