Improving Storage Capacity by Distributed Exact Deduplication Systems

Cited by: 0
Authors
Barca, Cristian [1 ]
Barca, Dan Claudiu [1 ]
Mara, Constantin [1 ]
Anghelescu, Petre [1 ]
Gavriloaia, Bogdan [2 ]
Vizireanu, Radu [2 ]
Craciunescu, Razvan [2 ]
Fratu, Octavian [2 ]
Affiliations
[1] Univ Pitesti, Fac Elect Commun & Comp, Pitesti, Romania
[2] Univ Politehn Bucuresti, Telecommun Dept, Bucharest, Romania
Keywords
distributed exact deduplication; in-line deduplication; hashed fingerprints; indexing; load balancing;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data deduplication has lately received a lot of attention for its storage reduction functionality. Data deduplication essentially refers to the elimination of redundant data, leaving only one copy of the data to be stored, and is meant to ease the burden of exponential data growth in backup or archiving centers. Most existing state-of-the-art deduplication systems rely on approximate deduplication in order to achieve high performance. Unfortunately, these studies are usually conducted and tested on single-host systems. Although their authors claim that the designs can easily be applied to multi-node systems, we have not yet seen an extension that does so, which undermines confidence in that claim. Thus, in a world where data deduplication storage systems continuously struggle to provide the throughput and disk capacity necessary to store and retrieve data within reasonable times, we took on the task of designing a distributed deduplication system that achieves efficiency, scalability and throughput at a petascale capacity level. In this paper we present a proof-of-concept design that one can use to implement such a system: a Distributed Exact Deduplication System, which we believe will open the way towards a new generation of backup and archiving systems.
Pages: C11 - C16
Page count: 6
Related papers
50 in total
  • [31] Data deduplication mechanism for cloud storage systems
    Xu, Xiaolong
    Tu, Qun
    2015 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY, 2015, : 286 - 294
  • [32] A survey on novel classification of deduplication storage systems
    Mohamed, Shawgi M. A.
    Wang, Yongli
    DISTRIBUTED AND PARALLEL DATABASES, 2021, 39 (01) : 201 - 230
  • [33] Deduplication in unstructured-data storage systems
    Tolic, Andrej
    Brodnik, Andrej
    ELEKTROTEHNISKI VESTNIK-ELECTROTECHNICAL REVIEW, 2015, 82 (05) : 233 - 242
  • [34] A survey on novel classification of deduplication storage systems
    Shawgi M. A. Mohamed
    Yongli Wang
    Distributed and Parallel Databases, 2021, 39 : 201 - 230
  • [35] Lazy Exact Deduplication
    Ma, Jingwei
    Stones, Rebecca J.
    Ma, Yuxiang
    Wang, Jingui
    Ren, Junjie
    Wang, Gang
    Liu, Xiaoguang
    2016 32ND SYMPOSIUM ON MASS STORAGE SYSTEMS AND TECHNOLOGIES (MSST), 2016,
  • [36] Lazy Exact Deduplication
    Ma, Jingwei
    Stones, Rebecca J.
    Ma, Yuxiang
    Wang, Jingui
    Ren, Junjie
    Wang, Gang
    Liu, Xiaoguang
    ACM TRANSACTIONS ON STORAGE, 2017, 13 (02)
  • [37] A Study on Data Deduplication in HPC Storage Systems
    Meister, Dirk
    Kaiser, Juergen
    Brinkmann, Andre
    Cortes, Toni
    Kuhn, Michael
    Kunkel, Julian
    2012 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2012,
  • [38] Secure Deduplication Storage Systems with Keyword Search
    Li, Jin
    Chen, Xiaofeng
    Xhafa, Fatos
    Barolli, Leonard
    2014 IEEE 28TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA), 2014, : 971 - 977
  • [39] New Codes and Inner Bounds for Exact Repair in Distributed Storage Systems
    Goparaju, Sreechakra
    El Rouayheb, Salim
    Calderbank, Robert
    2014 48TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2014,
  • [40] Exact-Repair Codes With Partial Collaboration in Distributed Storage Systems
    Liu, Shiqiu
    Shum, Kenneth W.
    Li, Congduan
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2020, 68 (07) : 4012 - 4021