How Fast Can One Scale Down a Distributed File System?

Authors
Cheriere, Nathanael [1]
Antoniu, Gabriel [2]
Affiliations
[1] ENS Rennes, IRISA, Rennes, France
[2] INRIA, Rennes Bretagne Atlantique Res Ctr, Rennes, France
Keywords
Elastic Storage; Distributed File System; Malleable File System; Model; Decommission
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For efficient Big Data processing, efficient resource utilization becomes a major concern as large-scale computing infrastructures such as supercomputers or clouds keep growing in size. Naturally, energy and cost savings can be obtained by reducing idle resources. Malleability, which is the possibility for resource managers to dynamically increase or reduce the resources of jobs, appears as a promising means to progress towards this goal. However, state-of-the-art parallel and distributed file systems have not been designed with malleability in mind. This is mainly due to the supposedly high cost of storage decommission, which is considered to involve expensive data transfers. Nevertheless, as network and storage technologies evolve, old assumptions on potential bottlenecks can be revisited. In this study, we evaluate the viability of malleability as a design principle for a distributed file system. We specifically model the duration of the decommission operation, for which we obtain a theoretical lower bound. Then we consider HDFS as a use case and we show that our model can explain the measured decommission times. The existing decommission mechanism of HDFS is good when the network is the bottleneck, but could be accelerated by up to a factor of 3 when the storage is the limiting factor. With the highlights provided by our model, we suggest improvements to speed up decommission in HDFS and we discuss open perspectives for the design of efficient malleable distributed file systems.
Pages: 141-150
Page count: 10