Diagnosing Performance Bottlenecks in Massive Data Parallel Programs

Cited by: 5
Authors
Dias, Vinicius [1]
Moreira, Rubens [1]
Meira, Wagner, Jr. [1]
Guedes, Dorgival [1]
Affiliations
[1] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
DOI
10.1109/CCGrid.2016.81
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
The increasing amount of data being stored, and the variety of applications recently proposed to make use of those data, have enabled a whole new generation of parallel programming environments and paradigms. Although most of these novel environments provide abstract programming interfaces and embed several run-time strategies that simplify typical tasks in parallel and distributed systems, achieving good performance is still a challenge. In this paper, we identify some common sources of performance degradation in the Spark programming environment and discuss some diagnosis dimensions that can be used to better understand such degradation. We then describe our experience in using those dimensions to drive the identification of performance problems, and suggest how their impact may be minimized in real applications.
Pages: 273-276
Page count: 4