A MapReduce Optimization Method on Hadoop Cluster

Cited by: 2
Author
Wu, Xiaodong [1, 2]
Affiliations
[1] Quanzhou Normal Univ, Fac Math & Comp Sci, Quanzhou, Peoples R China
[2] Fujian Prov Univ, Key Lab Intelligent Comp & Informat Proc, Fujian Prov Key Lab Data Intens Comp, Quanzhou, Peoples R China
Keywords
MapReduce; Hadoop; Optimization; Polynomial Regression
DOI
10.1109/ICIICII.2015.92
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The MapReduce parallel and distributed computing framework has been widely applied in both academia and industry. A MapReduce application runs in two steps: Map and Reduce. The input data is divided into splits that can be processed concurrently, and the number of splits determines the number of Map tasks. In this paper, we present a regression-based method for computing the number of Map tasks and Reduce tasks so that the performance of the MapReduce application can be improved. Regression analysis is used to predict the execution time of MapReduce applications. Experimental results show that the proposed optimization method can effectively reduce the execution time of the applications.
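The abstract does not give the regression model itself, so the following is only a minimal sketch of the general idea under stated assumptions: the number of Map tasks is taken as the input size divided by the split size (rounded up), and a degree-2 polynomial is fitted by least squares to execution times measured at a few Reduce-task counts, after which the count with the lowest predicted time is chosen. The function names, the single-variable model, and the timing figures are all hypothetical and are not taken from the paper.

```python
import math


def num_map_tasks(input_bytes, split_bytes):
    """Assumed rule: one Map task per input split, with splits of equal size."""
    return math.ceil(input_bytes / split_bytes)


def fit_polynomial(xs, ys, degree=2):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] of c0 + c1*x + c2*x^2 + ...
    """
    n = degree + 1
    # Build (V^T V) and (V^T y), where V is the Vandermonde matrix of xs.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict_time(coeffs, tasks):
    """Predicted execution time for a given task count."""
    return sum(c * tasks ** i for i, c in enumerate(coeffs))


def best_task_count(coeffs, candidates):
    """Candidate task count with the lowest predicted execution time."""
    return min(candidates, key=lambda t: predict_time(coeffs, t))


# Hypothetical measurements: (number of Reduce tasks, execution time in seconds).
reduce_counts = [2, 4, 8, 12, 16]
exec_seconds = [118.0, 108.0, 100.0, 108.0, 132.0]

model = fit_polynomial(reduce_counts, exec_seconds, degree=2)
chosen = best_task_count(model, range(1, 21))
print("Map tasks for 1 GiB input with 128 MiB splits:",
      num_map_tasks(1024 ** 3, 128 * 1024 ** 2))
print("Reduce task count with lowest predicted time:", chosen)
```

On a real cluster the chosen value would typically be passed to the job configuration, and the actual number of splits depends on the input format in use, so the split rule above should be read as an approximation.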
Pages: 18-21
Page count: 4
Related papers
50 in total
  • [1] Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster
    Singh, Sudhakar
    Garg, Rakhi
    Mishra, P. K.
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2018, 67 : 348 - 364
  • [2] Performance analysis of MapReduce Programs on Hadoop cluster
    Maurya, Mahesh
    Mahajan, Sunita
    [J]. PROCEEDINGS OF THE 2012 WORLD CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGIES, 2012, : 505 - 510
  • [3] A Hadoop MapReduce Performance Prediction Method
    Song, Ge
    Meng, Zide
    Huet, Fabrice
    Magoules, Frederic
    Yu, Lei
    Lin, Xuelian
    [J]. 2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC), 2013, : 820 - 825
  • [4] Performance optimization for short job execution in Hadoop MapReduce
    Gu, Rong
    Yan, Jinshuang
    Yang, Xiaoliang
    Yuan, Chunfeng
    Huang, Yihua
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2014, 51 (06): : 1270 - 1280
  • [5] Phase-Reconfigurable Shuffle Optimization for Hadoop MapReduce
    Wang, Jihe
    Qiu, Meikang
    Guo, Bing
    Zong, Ziliang
    [J]. IEEE TRANSACTIONS ON CLOUD COMPUTING, 2020, 8 (02) : 418 - 431
  • [6] Performance Optimization for Short MapReduce Job Execution in Hadoop
    Yan, Jinshuang
    Yang, Xiaoliang
    Gu, Rong
    Yuan, Chunfeng
    Huang, Yihua
    [J]. SECOND INTERNATIONAL CONFERENCE ON CLOUD AND GREEN COMPUTING / SECOND INTERNATIONAL CONFERENCE ON SOCIAL COMPUTING AND ITS APPLICATIONS (CGC/SCA 2012), 2012, : 688 - 694
  • [7] Performance Analysis of MapReduce on OpenStack-based Hadoop Virtual Cluster
    Ahmad, Nazrul M.
    Yaacob, Asrul Hadi
    Amin, Anang Hudaya Muhamad
    Kannan, Subarmaniam
    [J]. 2014 IEEE 2ND INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATION TECHNOLOGIES (ISTT), 2014, : 132 - 137
  • [8] Using a Tunable Knob for Reducing Makespan of MapReduce Jobs in a Hadoop Cluster
    Yao, Yi
    Wang, Jiayin
    Sheng, Bo
    Mi, Ningfang
    [J]. 2013 IEEE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD 2013), 2013, : 1 - 8
  • [9] Observations on Factors Affecting Performance of MapReduce based Apriori on Hadoop Cluster
    Singh, Sudhakar
    Garg, Rakhi
    Mishra, P. K.
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND AUTOMATION (ICCCA), 2016, : 87 - 94
  • [10] A Performance Comparison of Apache Tez and MapReduce with Data Compression on Hadoop Cluster
    Rattanaopas, Kritwara
    [J]. PROCEEDINGS OF 2017 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2017,