Multicast-based Replication for Hadoop HDFS

Cited by: 0
Authors
Wu, Jiadong [1]
Hong, Bo [1]
Affiliation
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
MAPREDUCE
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hadoop HDFS is a popular open-source distributed storage system that serves as the foundation of many important big-data technologies. Data replication performance is crucial to HDFS because replication accounts for a major portion of the network traffic in the cluster. In this research, we propose to enable multicast-based replication, which is expected to use less network bandwidth than the native TCP-based pipelined replication method. We developed a congestion-controlled reliable multicast socket (the CCRMSocket) for HDFS and evaluated its performance on our multi-rack test platform. The experimental results show that our multicast implementation can effectively save bandwidth and coexist peacefully with TCP traffic. We also developed a simulator (the HFlowSim) to further study the impact of multicast-based replication on a large-scale Hadoop system. The simulation results suggest that multicast-based replication can systematically improve a Hadoop system by accelerating large jobs.
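
The bandwidth argument in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's CCRMSocket or HFlowSim. It assumes a hypothetical two-rack tree topology (the names `core`, `tor1`, `tor2`, `dn1` to `dn3` are invented for illustration) and simply counts how many link traversals one block needs when each node forwards it to the next (pipelined) versus when switches duplicate packets along a multicast tree.

```python
# Illustrative only: a toy two-rack tree, not the paper's test platform.
# Each node maps to its parent; "core" is the root switch.
PARENT = {
    "tor1": "core", "tor2": "core",
    "dn1": "tor1", "dn2": "tor2", "dn3": "tor2",
}

def path_to_root(node):
    """Return the (child, parent) links from node up to the root."""
    links = []
    while node in PARENT:
        links.append((node, PARENT[node]))
        node = PARENT[node]
    return links

def path_links(src, dst):
    """Links on the unique tree path between src and dst."""
    up, down = path_to_root(src), path_to_root(dst)
    # Drop the shared links above the lowest common ancestor.
    while up and down and up[-1] == down[-1]:
        up.pop()
        down.pop()
    return up + down

def pipeline_cost(chain):
    """Link traversals when each node forwards the block to the next one."""
    return sum(len(path_links(a, b)) for a, b in zip(chain, chain[1:]))

def multicast_cost(src, receivers):
    """Link traversals when every link of the multicast tree carries the block once."""
    used = set()
    for receiver in receivers:
        used.update(path_links(src, receiver))
    return len(used)

# Assumed scenario: the writer sits on dn1; two more replicas go to dn2 and dn3.
print("pipelined:", pipeline_cost(["dn1", "dn2", "dn3"]))
print("multicast:", multicast_cost("dn1", ["dn2", "dn3"]))
```

In this assumed layout the pipelined chain crosses the link between `dn2` and its rack switch twice, once to deliver the block and once to forward it, while the multicast tree uses each link once. The size of the saving depends entirely on the topology and replica placement chosen here.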
Pages: 143-148
Page count: 6