How Fast Can One Scale Down a Distributed File System?

Cited by: 0
Authors
Cheriere, Nathanael [1]
Antoniu, Gabriel [2]
Affiliations
[1] ENS Rennes, IRISA, Rennes, France
[2] INRIA, Rennes Bretagne Atlantique Res Ctr, Rennes, France
Keywords
Elastic Storage; Distributed File System; Malleable File System; Model; Decommission
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For efficient Big Data processing, resource utilization becomes a major concern as large-scale computing infrastructures such as supercomputers or clouds keep growing in size. Naturally, energy and cost savings can be obtained by reducing idle resources. Malleability, the ability of resource managers to dynamically increase or reduce the resources allocated to jobs, appears to be a promising means of progressing towards this goal. However, state-of-the-art parallel and distributed file systems have not been designed with malleability in mind. This is mainly due to the supposedly high cost of storage decommission, which is considered to involve expensive data transfers. Nevertheless, as network and storage technologies evolve, old assumptions about potential bottlenecks can be revisited. In this study, we evaluate the viability of malleability as a design principle for a distributed file system. Specifically, we model the duration of the decommission operation and obtain a theoretical lower bound for it. We then consider HDFS as a use case and show that our model can explain the measured decommission times. The existing decommission mechanism of HDFS performs well when the network is the bottleneck, but could be accelerated by up to a factor of 3 when storage is the limiting factor. With the insights provided by our model, we suggest improvements to speed up decommission in HDFS and discuss open perspectives for the design of efficient malleable distributed file systems.
Pages: 141-150
Page count: 10