Analyzing & Optimizing Hadoop Performance

Cited by: 0
Authors
Jain, Ankita [1 ]
Choudhary, Monika [1 ]
Affiliations
[1] Indira Gandhi Tech Univ Women, Dept Comp Sci, Delhi, India
Keywords
Hadoop; BigData; MapReduce; Configuration parameter; Performance Optimization;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Processing and analyzing BigData in a timely and cost-effective manner is an important but tedious job. It is highly desirable that any BigData management framework process and analyze data within a fraction of a second. Various tools are available for this purpose; one such open-source tool is Apache Hadoop, which is widely adopted for the storage and management of BigData. Several methods have been suggested and implemented to analyze and enhance Hadoop performance. This paper focuses on the approach of tuning configuration parameters. Although several configuration parameters affect the performance of Hadoop, the MapReduce-related parameters have a significant impact. The objective of this work is to enhance the overall performance of Hadoop by reducing job execution time, and this reduction is obtained by tuning some of the MapReduce-associated parameters. A correct understanding of these parameters is crucial, since setting them to improper values can degrade overall performance. The proposed approach saves job execution time and optimizes disk usage. It significantly improves the overall performance of Hadoop, by 38.51% over the base system in a heterogeneous environment.
Pages: 116-121
Page count: 6