Improving Storage Capacity by Distributed Exact Deduplication Systems

Cited by: 0
Authors
Barca, Cristian [1]
Barca, Dan Claudiu [1]
Mara, Constantin [1]
Anghelescu, Petre [1]
Gavriloaia, Bogdan [2]
Vizireanu, Radu [2]
Craciunescu, Razvan [2]
Fratu, Octavian [2]
Affiliations
[1] Univ Pitesti, Fac Elect Commun & Comp, Pitesti, Romania
[2] Univ Politehn Bucuresti, Telecommun Dept, Bucharest, Romania
Keywords
distributed exact deduplication; in-line deduplication; hashed fingerprints; indexing; load balancing
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data deduplication has recently received a lot of attention for its ability to reduce storage requirements. Deduplication essentially means eliminating redundant data so that only one copy is stored, and it is meant to ease the burden of exponential data growth in backup and archiving centers. Most existing state-of-the-art deduplication systems rely on approximate deduplication to achieve high performance. Unfortunately, these studies are usually conducted and tested on single-host systems. Although their authors claim that the designs can easily be applied to multi-node systems, we have not yet seen an extension that does so, so these claims remain unverified. Thus, in a world where deduplication storage systems continually struggle to provide the throughput and disk capacity needed to store and retrieve data within reasonable times, we take on the task of designing a distributed deduplication system that achieves efficiency, scalability and throughput at petascale capacity. In this paper we present a proof-of-concept design that one can use to implement such a system: a Distributed Exact Deduplication System, which we believe will open the way to a new generation of backup and archiving systems.
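As a rough illustration of what exact deduplication with hashed fingerprints means (this is not the design presented in the paper), the following minimal Python sketch stores each unique chunk once, keyed by its SHA-256 fingerprint. The fixed 4-byte chunk size, the `ExactDedupStore` class and its method names are assumptions made for this example only.

```python
import hashlib


class ExactDedupStore:
    """Toy single-node store; hypothetical, for illustration only."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # assumed fixed-size chunking
        self.index = {}               # fingerprint -> chunk bytes (stored once)

    def write(self, data):
        """Store data and return its list of fingerprints (the 'recipe')."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            # Exact deduplication: every fingerprint is checked against the
            # full index, so a duplicate chunk is never stored twice.
            if fp not in self.index:
                self.index[fp] = chunk
            recipe.append(fp)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its fingerprint recipe."""
        return b"".join(self.index[fp] for fp in recipe)


store = ExactDedupStore()
recipe = store.write(b"AAAABBBBAAAACCCC")
restored = store.read(recipe)
unique_chunks = len(store.index)
```

Here the input splits into four chunks, of which two are identical, so the index ends up holding three chunks while the recipe still lists four fingerprints. An approximate scheme would instead consult only part of the index and could miss some duplicates in exchange for speed.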
Pages: C11-C16
Page count: 6