Multicast-based Replication for Hadoop HDFS

Authors
Wu, Jiadong [1]
Hong, Bo [1]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
MapReduce
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Hadoop HDFS is a popular open-source distributed storage system that serves as the foundation of many important big-data technologies. The performance of data replication is crucial to HDFS, since replication accounts for a major portion of the network traffic in the entire cluster. In this research, we propose multicast-based replication, which is expected to use less network bandwidth than the native TCP-based pipelined replication method. We developed a congestion-controlled reliable multicast socket (CCRMSocket) for HDFS and evaluated its performance on our multi-rack test platform. The experimental results show that our multicast implementation effectively saves bandwidth and coexists peacefully with TCP traffic. We also developed a simulator (HFlowSim) to further study the impact of multicast-based replication on a large-scale Hadoop system. The simulation results suggest that multicast-based replication can systematically improve a Hadoop system by accelerating large jobs.
Pages: 143-148
Page count: 6