Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-scale Multithreaded BlueGene/Q Supercomputer

Cited by: 2
Authors
Wu, Xingfu [1 ]
Taylor, Valerie [1 ]
Affiliations
[1] Texas A&M Univ, Dept Comp Sci & Engn, College Stn, TX 77843 USA
Keywords
Performance analysis; hybrid MPI/OpenMP; multithreaded; BlueGene/Q;
DOI
10.1109/SNPD.2013.81
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone codes, SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded BlueGene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting from using different numbers of threads per node. We use the performance tools and MPI profiling and tracing libraries available on the supercomputer to analyze and compare the performance of these hybrid applications as the number of OpenMP threads per node increases, and find that beyond a certain point, adding threads saturates or worsens performance. For the strong-scaling applications SP-MZ, BT-MZ, PEQdyna, and PMLB, using 32 threads per node yields much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (floating-point unit) percentage decreases, while the MPI percentage (except for PMLB) and the IPC (instructions per cycle) per core (except for BT-MZ) increase. For the weak-scaling application GTC, the performance trend (relative speedup) with increasing threads per node is very similar regardless of how many nodes (32, 128, or 512) are used.
Pages: 303-309
Page count: 7
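The abstract's central comparison is application efficiency under strong scaling (32 vs. 64 threads per node) and relative speedup under weak scaling. As a minimal sketch of how these two metrics are computed, the snippet below uses purely hypothetical runtimes that are not measurements from the paper:

```python
# Strong-scaling efficiency and relative speedup, the two metrics the
# abstract uses to compare thread counts per node on BlueGene/Q.
# All runtimes below are hypothetical illustrations, not paper data.

def strong_scaling_efficiency(t_base, t_scaled, base_threads, scaled_threads):
    """Efficiency = achieved speedup divided by the ideal speedup
    implied by the increase in thread count."""
    speedup = t_base / t_scaled
    ideal = scaled_threads / base_threads
    return speedup / ideal

def relative_speedup(t_base, t_scaled):
    """Simple runtime ratio, as used for the weak-scaling trend."""
    return t_base / t_scaled

# Hypothetical runtimes (seconds) for a fixed problem size on one node:
t32 = 10.0   # 32 threads per node
t64 = 9.0    # 64 threads per node: barely faster despite 2x the threads

eff = strong_scaling_efficiency(t32, t64, 32, 64)
print(f"efficiency going 32 -> 64 threads/node: {eff:.2f}")
```

With these illustrative numbers the efficiency is about 0.56, i.e. doubling the threads per node recovers only about half the ideal gain, which is the kind of saturation the abstract reports for the strong-scaling codes.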