How Fast Can One Scale Down a Distributed File System?

Cited by: 0
Authors
Cheriere, Nathanael [1]
Antoniu, Gabriel [2]
Affiliations
[1] ENS Rennes, IRISA, Rennes, France
[2] INRIA, Rennes Bretagne Atlantique Res Ctr, Rennes, France
Keywords
Elastic Storage; Distributed File System; Malleable File System; Model; Decommission
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For efficient Big Data processing, efficient resource utilization becomes a major concern as large-scale computing infrastructures such as supercomputers or clouds keep growing in size. Naturally, energy and cost savings can be obtained by reducing idle resources. Malleability, which is the possibility for resource managers to dynamically increase or reduce the resources of jobs, appears as a promising means to progress towards this goal. However, state-of-the-art parallel and distributed file systems have not been designed with malleability in mind. This is mainly due to the supposedly high cost of storage decommission, which is considered to involve expensive data transfers. Nevertheless, as network and storage technologies evolve, old assumptions on potential bottlenecks can be revisited. In this study, we evaluate the viability of malleability as a design principle for a distributed file system. We specifically model the duration of the decommission operation, for which we obtain a theoretical lower bound. Then we consider HDFS as a use case and we show that our model can explain the measured decommission times. The existing decommission mechanism of HDFS performs well when the network is the bottleneck, but could be accelerated by up to a factor of 3 when the storage is the limiting factor. With the insights provided by our model, we suggest improvements to speed up decommission in HDFS and we discuss open perspectives for the design of efficient malleable distributed file systems.
Pages: 141-150
Number of pages: 10