Diagnosing Performance Bottlenecks in Massive Data Parallel Programs

Cited by: 5
Authors
Dias, Vinicius [1 ]
Moreira, Rubens [1 ]
Meira, Wagner, Jr. [1 ]
Guedes, Dorgival [1 ]
Affiliations
[1] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
DOI
10.1109/CCGrid.2016.81
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
The increasing amount of data being stored and the variety of applications recently proposed to make use of those data have enabled a whole new generation of parallel programming environments and paradigms. Although most of these novel environments provide abstract programming interfaces and embed run-time strategies that simplify several typical tasks in parallel and distributed systems, achieving good performance is still a challenge. In this paper we identify some common sources of performance degradation in the Spark programming environment and discuss some diagnosis dimensions that can be used to better understand such degradation. We then describe our experience in using those dimensions to drive the identification of performance problems, and suggest how their impact may be minimized in real applications.
Pages: 273 - 276
Page count: 4
Related papers
50 in total
  • [21] Predicting the performance of parallel programs
    Blanco, V
    González, JA
    León, C
    Rodríguez, C
    Rodríguez, G
    Printista, M
    PARALLEL COMPUTING, 2004, 30 (03) : 337 - 356
  • [22] THE PERFORMANCE OF PARALLEL PROLOG PROGRAMS
    FAGIN, BS
    DESPAIN, AM
    IEEE TRANSACTIONS ON COMPUTERS, 1990, 39 (12) : 1434 - 1445
  • [23] PERFORMANCE VISUALIZATION FOR PARALLEL PROGRAMS
    LUSK, E
    THEORETICA CHIMICA ACTA, 1993, 84 (4-5) : 377 - 384
  • [24] A performance estimator for parallel programs
    Reeve, J
    EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 193 - 202
  • [25] A data-driven approach to diagnosing throughput bottlenecks from a maintenance perspective
    Subramaniyan, Mukund
    Skoogh, Anders
    Muhammad, Azam Sheikh
    Bokrantz, Jon
    Johansson, Bjorn
    Roser, Christoph
    COMPUTERS & INDUSTRIAL ENGINEERING, 2020, 150 (150)
  • [26] Diagnosing Development Bottlenecks: China and India
    Li, Wei
    Mengistae, Taye
    Xu, Lixin Colin
    OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 2011, 73 (06) : 722 - 752
  • [27] A ROBUST PARALLEL FRAMEWORK FOR MASSIVE SPATIAL DATA PROCESSING ON HIGH PERFORMANCE CLUSTERS
    Guan, Xuefeng
    XXII ISPRS CONGRESS, TECHNICAL COMMISSION IV, 2012, 39-B4 : 213 - 217
  • [28] Fault tolerant algorithm for functional and data flow parallel programs performance on clusters
    Bazhanov, S. E.
    Bogdanets, S. V.
    Kutepov, V. P.
    DEPCOS - RELCOMEX 2008: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DEPENDABILITY OF COMPUTER SYSTEMS, 2008, : 87 - +
  • [30] Diagnosing Highly-Parallel OpenMP Programs with Aggregated Grain Graphs
    Reissmann, Nico
    Muddukrishna, Ananya
    EURO-PAR 2018: PARALLEL PROCESSING, 2018, 11014 : 106 - 119