HotROD: Managing Grid Storage with On-Demand Replication

Cited by: 0
Authors
Rao, Sriram [1 ]
Reed, Benjamin [1 ]
Silberstein, Adam [1 ]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Enterprises such as Yahoo!, LinkedIn, and Facebook operate their own compute and storage infrastructure, which is effectively a "private cloud". The private cloud consists of multiple clusters, each managed independently. With HDFS, data stored in a cluster is replicated within that cluster for availability. Unfortunately, for datasets shared across the enterprise, this leads to over-replication within the private cloud. An analysis of Yahoo!'s HDFS usage suggests that the disk space consumed by replicating shared datasets is substantial, on the order of petabytes. New datasets are typically popular and are requested by many processing jobs in different clusters. This demand is satisfied by copying the dataset to each of the clusters. As datasets age, however, they are used less and become cold. We then have the opposite problem of data being over-replicated across clusters: each cluster holds enough replicas to recover from data loss locally, so the total number of replicas is high. We address both problems, initial replication and cross-cluster recovery, in a private cloud setting using the same technique: on-demand replication, which we refer to as Hot Replication On-Demand (HotROD). By making files visible across HDFS clusters, we let a cluster pull in remote replicas as needed, both for initial replication and for later recovery. We implemented HotROD as an extension to a standard HDFS installation.
Pages: 243-249
Number of pages: 7