Spark Deployment and Performance Evaluation on the MareNostrum Supercomputer

Authors
Tous, Ruben [1, 2]
Gounaris, Anastasios [3]
Tripiana, Carlos [1]
Torres, Jordi [1, 2]
Girona, Sergi [1]
Ayguade, Eduard [1, 2]
Labarta, Jesus [1, 2]
Becerra, Yolanda [1, 2]
Carrera, David [1, 2]
Valero, Mateo [1, 2]
Affiliations
[1] Barcelona Supercomp Ctr, Barcelona, Spain
[2] Univ Politecn Cataluna, Barcelona, Spain
[3] Aristotle Univ Thessaloniki, Thessaloniki, Greece
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we present a framework to enable data-intensive Spark workloads on MareNostrum, a petascale supercomputer designed mainly for compute-intensive applications. As far as we know, this is the first attempt to investigate optimized deployment configurations of Spark on a petascale HPC setup. We detail the design of the framework and present benchmark data that provide insights into the scalability of the system. We examine the impact of different configurations, including parallelism, storage, and networking alternatives, and we discuss several aspects of executing Big Data workloads on a computing system that is based on the compute-centric paradigm. Further, we derive conclusions that aim to pave the way towards systematic and optimized methodologies for fine-tuning data-intensive applications on large clusters, with an emphasis on parallelism configurations.
Pages: 299-306
Number of pages: 8