Using Realistic Simulation for Performance Analysis of MapReduce Setups

Authors
Wang, Guanying [1]
Butt, Ali R. [1]
Pandey, Prashant
Gupta, Karan
Affiliations
[1] Virginia Tech, Blacksburg, VA 24061 USA
Funding
National Science Foundation (USA)
Keywords
Cloud Computing; Hadoop; Simulation; MapReduce
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recently, there has been a huge growth in the amount of data processed by enterprises and the scientific computing community. Two promising trends ensure that applications will be able to deal with ever-increasing data volumes: first, the emergence of cloud computing, which provides transparent access to a large number of compute, storage and networking resources; and second, the development of the MapReduce programming model, which provides a high-level abstraction for data-intensive computing. However, the design space of these systems has not been explored in detail. Specifically, the impact of various design choices and run-time parameters of a MapReduce system on application performance remains an open question. To this end, we embarked on systematically understanding the performance of MapReduce systems, but soon realized that understanding the effects of parameter tweaking in a large-scale setup with many variables was impractical. Consequently, in this paper, we present the design of an accurate MapReduce simulator, MRPerf, for facilitating exploration of the MapReduce design space. MRPerf captures various aspects of a MapReduce setup, and uses this information to predict expected application performance. In essence, MRPerf can serve as a design tool for MapReduce infrastructure, and as a planning tool that makes MapReduce deployment far easier by reducing the number of parameters that currently have to be hand-tuned using rules of thumb. Our validation of MRPerf using data from medium-scale production clusters shows that it is able to predict application performance accurately, and thus can be a useful tool in enabling cloud computing. Moreover, an initial application of MRPerf to our test clusters running Hadoop revealed a performance bottleneck; fixing it resulted in up to a 28.05% performance improvement.
Pages: 19-26
Number of pages: 8