Analysis of HDFS RPC and Hadoop with RDMA by Evaluating Write Performance

Authors
Singh, Somya [1]
Raj, Gaurav [1]
Kaur, Gurneet [2]
Affiliations
[1] Amity Univ, CSE Dept, Noida, Uttar Pradesh, India
[2] Verizon Data Serv India, Madras, Tamil Nadu, India
Keywords
Big Data; Hadoop; HDFS; Replication Factor; RDMA-based Hadoop
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In the era of data explosion, Big Data has become one of the most popular terms, and Hadoop is a mainstream platform for large-scale data handling and processing. The Hadoop Distributed File System (HDFS) addresses data availability by replicating data across multiple servers. Because the data must be written to multiple remote locations, write performance is a major consideration. This paper evaluates HDFS write performance under two replication schemes: the default pipelined replication and parallel replication. The TestDFSIO benchmark is used to benchmark the Hadoop cluster with both schemes. The write throughput of the Hadoop cluster with parallel replication is observed to be 8% higher than with pipelined replication. RDMA-based HDFS is also observed to perform far better than HDFS over RPC.
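The difference between the two replication schemes can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the node labels, and the throughput figures in the usage note are hypothetical, and the sketch assumes that in parallel replication the client sends the block to every DataNode directly, whereas in pipelined replication each node forwards the block to the next one in the chain.

```python
from typing import List, Tuple

# A transfer is modeled as a (sender, receiver) pair. Hypothetical helper type.
Transfer = Tuple[str, str]


def pipelined_transfers(client: str, datanodes: List[str]) -> List[Transfer]:
    """Pipelined scheme: client -> dn1 -> dn2 -> ... (each hop forwards the block)."""
    chain = [client] + datanodes
    return list(zip(chain[:-1], chain[1:]))


def parallel_transfers(client: str, datanodes: List[str]) -> List[Transfer]:
    """Parallel scheme (assumed here): the client sends to every DataNode itself."""
    return [(client, dn) for dn in datanodes]


def relative_gain_percent(new: float, baseline: float) -> float:
    """Percentage by which `new` throughput exceeds `baseline` throughput."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return 100.0 * (new - baseline) / baseline
```

As a usage illustration, with a replication factor of 3, `pipelined_transfers("client", ["dn1", "dn2", "dn3"])` yields a chain of three hops, while `parallel_transfers` yields three transfers that all originate at the client. With made-up throughputs of 108 MB/s and 100 MB/s, `relative_gain_percent(108.0, 100.0)` gives a gain of the same size as the 8% reported in the abstract; these figures are illustrative only and are not taken from the paper.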
Pages: 368-372
Page count: 5