MREv: an Automatic MapReduce Evaluation Tool for Big Data Workloads

Cited by: 6
Authors
Veiga, Jorge [1]
Exposito, Roberto R. [1]
Taboada, Guillermo L. [1]
Tourino, Juan [1]
Affiliations
[1] Univ A Coruna, Comp Architecture Grp, La Coruna, Spain
Keywords
High Performance Computing (HPC); Big Data; MapReduce; Performance Evaluation; Resource Efficiency; InfiniBand
DOI
10.1016/j.procs.2015.05.202
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The popularity of Big Data computing models like MapReduce has caused the emergence of many frameworks oriented to High Performance Computing (HPC) systems. The suitability of each one to a particular use case depends on its design and implementation, the underlying system resources and the type of application to be run. Therefore, the appropriate selection of one of these frameworks generally involves the execution of multiple experiments in order to assess their performance, scalability and resource efficiency. This work studies the main issues of this evaluation, proposing a new MapReduce Evaluator (MREv) tool which unifies the configuration of the frameworks, eases the task of collecting results and generates resource utilization statistics. Moreover, a practical use case is described, including examples of the experimental results provided by this tool. MREv is available to download at http://mrev.des.udc.es.
Pages: 80-89
Number of pages: 10
Related papers
50 in total
  • [21] Memory System Characterization of Big Data Workloads
    Dimitrov, Martin
    Kumar, Karthik
    Lu, Patrick
    Viswanathan, Vish
    Willhalm, Thomas
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013
  • [22] Automotive Big Data: Applications, Workloads and Infrastructures
    Luckow, Andre
    Kennedy, Ken
    Manhardt, Fabian
    Djerekarov, Emil
    Vorster, Bennie
    Apon, Amy
    [J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 1201 - 1210
  • [23] TideWatch: Fingerprinting the Cyclicality of Big Data Workloads
    Williams, Dan
    Zheng, Shuai
    Zhang, Xiangliang
    Jamjoom, Hani
    [J]. 2014 PROCEEDINGS IEEE INFOCOM, 2014, : 2031 - 2039
  • [24] Characterization and Architectural Implications of Big Data Workloads
    Wang, Lei
    Ren, Rui
    Zhan, Jianfeng
    Jia, Zhen
    [J]. 2016 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE ISPASS 2016, 2016, : 145 - 146
  • [25] Evaluation of high-level query languages based on MapReduce in Big Data
    Birjali, Marouane
    Beni-Hssane, Abderrahim
    Erritali, Mohammed
    [J]. JOURNAL OF BIG DATA, 2018, 5 (01)
  • [26] Scheduling Data Intensive Workloads through Virtualization on MapReduce based Clouds
    Rao, B. Thirumala
    Reddy, L. S. S.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2013, 13 (06) : 105 - 112
  • [27] Cross-Cloud MapReduce for Big Data
    Li, Peng
    Guo, Song
    Yu, Shui
    Zhuang, Weihua
    [J]. IEEE TRANSACTIONS ON CLOUD COMPUTING, 2020, 8 (02) : 375 - 386
  • [28] A Mapreduce Fuzzy Techniques of Big Data Classification
    El Bakry, Malak
    Safwat, Soha
    Hegazy, Osman
    [J]. PROCEEDINGS OF THE 2016 SAI COMPUTING CONFERENCE (SAI), 2016, : 118 - 128
  • [29] Evolving Big Data Stream Classification with MapReduce
    Haque, Ahsanul
    Parker, Brandon
    Khan, Latifur
    Thuraisingham, Bhavani
    [J]. 2014 IEEE 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2014, : 570 - 577
  • [30] Design of MapReduce and CTA for Big Data System
    Kim, Earl
    Shin, Dong-ryeol
    [J]. 2015 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND ARTIFICIAL INTELLIGENCE (CAAI 2015), 2015, : 294 - 297