High-Performance Design of Hadoop RPC with RDMA over InfiniBand

Cited by: 60
Authors
Lu, Xiaoyi [1 ]
Islam, Nusrat S. [1 ]
Wasi-ur-Rahman, Md [1 ]
Jose, Jithin [1 ]
Subramoni, Hari [1 ]
Wang, Hao [1 ]
Panda, Dhabaleswar K. [1 ]
Affiliation
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
DOI
10.1109/ICPP.2013.78
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Hadoop RPC is the basic communication mechanism in the Hadoop ecosystem. It is used with other Hadoop components such as MapReduce, HDFS, and HBase in real-world data centers, e.g., at Facebook and Yahoo!. However, the current Hadoop RPC design is built on the Java sockets interface, which limits its potential performance. The High Performance Computing community has exploited high-throughput, low-latency networks such as InfiniBand for many years. In this paper, we first analyze the performance of the current Hadoop RPC design by unearthing buffer management and communication bottlenecks that are not apparent on slower networks. Then we propose a novel design (RPCoIB) of Hadoop RPC with RDMA over InfiniBand networks. RPCoIB provides a JVM-bypassed buffer management scheme and utilizes message size locality to avoid multiple memory allocations and copies in data serialization and deserialization. Our performance evaluations reveal that the basic ping-pong latencies for varied data sizes are reduced by 42%-49% and 46%-50% compared with 10GigE and IPoIB QDR (32 Gbps), respectively, while the RPCoIB design also improves the peak throughput by 82% and 64% compared with 10GigE and IPoIB, respectively. Compared with default Hadoop over IPoIB QDR, our RPCoIB design improves the performance of the Sort benchmark on 64 compute nodes by 15% and the performance of the CloudBurst application by 10%. We also present thorough, integrated evaluations of our RPCoIB design with other research directions that optimize HDFS and HBase using RDMA over InfiniBand. Compared with their best performance, we observe a 10% improvement for HDFS-IB and a 24% improvement for HBase-IB. To the best of our knowledge, this is the first such design of the Hadoop RPC system over high-performance networks such as InfiniBand.
Pages: 641-650
Page count: 10
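
Illustrative note: the abstract says RPCoIB uses message size locality to avoid repeated memory allocations and copies during serialization. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes that successive calls of the same kind tend to have similar sizes, so the last observed size is used as a hint when picking a pooled buffer. The class name `SizeHintedBufferPool`, its methods, the power-of-two bucketing, and the use of plain `bytearray` objects (rather than JVM-bypassed, RDMA-registered memory) are all assumptions made for illustration.

```python
from collections import defaultdict


class SizeHintedBufferPool:
    """Hypothetical pool: reuses buffers sized by the last message per call kind."""

    def __init__(self, default_size=128):
        self.default_size = default_size
        self.last_size = {}              # call kind -> size of its last message
        self.free = defaultdict(list)    # buffer capacity -> idle buffers
        self.allocations = 0             # number of fresh buffers created
        self.regrows = 0                 # number of times the hint was too small

    @staticmethod
    def _bucket(n):
        # Round up to a power of two so near-equal sizes share buffers.
        size = 1
        while size < n:
            size *= 2
        return size

    def _take(self, capacity):
        # Prefer an idle buffer of this capacity; allocate only if none exists.
        if self.free[capacity]:
            return self.free[capacity].pop()
        self.allocations += 1
        return bytearray(capacity)

    def serialize(self, kind, payload):
        """Copy payload into a pooled buffer and return (buffer, length)."""
        hint = self.last_size.get(kind, self.default_size)
        buf = self._take(self._bucket(hint))
        if len(payload) > len(buf):
            # The hint was wrong: return the small buffer and take a larger one.
            self.regrows += 1
            self.release(buf)
            buf = self._take(self._bucket(len(payload)))
        buf[:len(payload)] = payload
        self.last_size[kind] = len(payload)   # remember the size for next time
        return buf, len(payload)

    def release(self, buf):
        # Hand the buffer back so a later call can reuse it.
        self.free[len(buf)].append(buf)


pool = SizeHintedBufferPool()
for _ in range(5):
    buf, n = pool.serialize("heartbeat", b"x" * 1000)
    pool.release(buf)
print(pool.allocations, pool.regrows)
```

In this sketch only the first call pays for an undersized buffer; later calls of the same kind start from a buffer that already fits, which is the effect the abstract attributes to message size locality.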