Using Realistic Simulation for Performance Analysis of MapReduce Setups

Cited by: 0
Authors
Wang, Guanying [1]
Butt, Ali R. [1]
Pandey, Prashant
Gupta, Karan
Affiliations
[1] Virginia Tech, Blacksburg, VA 24061 USA
Funding
National Science Foundation (USA)
Keywords
Cloud Computing; Hadoop; Simulation; MapReduce;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recently, there has been a huge growth in the amount of data processed by enterprises and the scientific computing community. Two promising trends ensure that applications will be able to deal with ever-increasing data volumes: first, the emergence of cloud computing, which provides transparent access to a large number of compute, storage and networking resources; and second, the development of the MapReduce programming model, which provides a high-level abstraction for data-intensive computing. However, the design space of these systems has not been explored in detail. Specifically, the impact of various design choices and run-time parameters of a MapReduce system on application performance remains an open question. To this end, we embarked on systematically understanding the performance of MapReduce systems, but soon realized that understanding the effects of parameter tweaking in a large-scale setup with many variables was impractical. Consequently, in this paper, we present the design of an accurate MapReduce simulator, MRPerf, for facilitating exploration of the MapReduce design space. MRPerf captures various aspects of a MapReduce setup, and uses this information to predict expected application performance. In essence, MRPerf can serve as a design tool for MapReduce infrastructure, and as a planning tool for making MapReduce deployment far easier via reduction in the number of parameters that currently have to be hand-tuned using rules of thumb. Our validation of MRPerf using data from medium-scale production clusters shows that it is able to predict application performance accurately, and thus can be a useful tool in enabling cloud computing. Moreover, an initial application of MRPerf to our test clusters running Hadoop revealed a performance bottleneck, fixing which resulted in up to 28.05% performance improvement.
Pages: 19-26
Page count: 8