Content-Based Chunk Placement Scheme for Decentralized Deduplication on Distributed File Systems

Authors
Kim, Keonwoo [1]
Kim, Jeehong [1]
Min, Changwoo [1]
Eom, Young Ik [1]
Affiliation
[1] Sungkyunkwan Univ, Coll Informat & Commun Engn, Suwon, South Korea
Keywords
Deduplication; Distributed file system; Chunk placement; Consistent hashing
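DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812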
Abstract
The rapid growth of data volume causes several problems, such as limited storage capacity and rising data management costs. Distributed file systems (DFSs) are widely used to store and manage massive data, and data deduplication schemes are being studied extensively to reduce storage volume. Deduplication increases the available storage capacity by eliminating duplicate data. However, the deduplication process incurs performance overhead, such as additional disk I/O. In this paper, we propose a content-based chunk placement scheme that increases the deduplication rate on a DFS. To avoid the performance overhead of the deduplication process, we run lessfs on each chunk server, so that deduplication is performed in a decentralized manner on each chunk server. Moreover, we use consistent hashing for chunk allocation and failure recovery. Our experimental results show that the proposed system uses 60% less storage space than a system without consistent hashing.
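The idea of content-based placement with consistent hashing can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed 4 KiB chunk size, the SHA-1 fingerprint, the number of virtual nodes, the server names, and all function and class names below are assumptions made for illustration. The local deduplication that lessfs would perform on each chunk server is modeled here simply as a set of fingerprints per server.

```python
import bisect
import hashlib


def chunk_fingerprint(chunk: bytes) -> str:
    """Content fingerprint of a chunk (SHA-1 is an assumption of this sketch)."""
    return hashlib.sha1(chunk).hexdigest()


class HashRing:
    """Consistent-hash ring that maps chunk fingerprints to chunk servers."""

    def __init__(self, servers, vnodes=64):
        # Each server owns `vnodes` points on the ring (hypothetical default).
        ring = []
        for server in servers:
            for i in range(vnodes):
                digest = hashlib.sha1(f"{server}#{i}".encode()).hexdigest()
                ring.append((int(digest, 16), server))
        ring.sort()
        self._ring = ring
        self._positions = [pos for pos, _ in ring]

    def server_for(self, fingerprint: str) -> str:
        # The chunk goes to the first ring point clockwise from its hash.
        pos = int(fingerprint, 16)
        idx = bisect.bisect_right(self._positions, pos) % len(self._ring)
        return self._ring[idx][1]


def place_chunks(data: bytes, ring: HashRing, chunk_size: int = 4096):
    """Split data into fixed-size chunks and group unique fingerprints by server."""
    placement = {}
    for offset in range(0, len(data), chunk_size):
        fp = chunk_fingerprint(data[offset:offset + chunk_size])
        # A set stands in for the per-server deduplicating store.
        placement.setdefault(ring.server_for(fp), set()).add(fp)
    return placement


if __name__ == "__main__":
    ring = HashRing(["cs1", "cs2", "cs3"])
    data = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
    for server, fingerprints in sorted(place_chunks(data, ring).items()):
        print(server, len(fingerprints))
```

Because placement depends only on chunk content, identical chunks always reach the same server, where a local deduplicating store can keep a single copy. When a server leaves the ring, only the chunks it owned move to another server, which is the property that makes consistent hashing attractive for failure recovery.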
Pages: 173-183
Page count: 11