Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-scale Multithreaded BlueGene/Q Supercomputer

Cited: 2
Authors
Wu, Xingfu [1 ]
Taylor, Valerie [1 ]
Affiliations
[1] Texas A&M Univ, Dept Comp Sci & Engn, College Stn, TX 77843 USA
Keywords
Performance analysis; hybrid MPI/OpenMP; multithreaded; BlueGene/Q
DOI
10.1109/SNPD.2013.81
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone codes, SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded BlueGene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap that results from using different numbers of threads per node. We use the performance tools and MPI profiling and tracing libraries available on the supercomputer to analyze and compare the performance of these hybrid applications as the number of OpenMP threads per node increases, and find that beyond a certain point, adding threads saturates or even worsens performance. For the strong-scaling applications SP-MZ, BT-MZ, PEQdyna, and PMLB, using 32 threads per node yields much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (floating-point unit) percentage decreases, while the MPI percentage (except for PMLB) and the IPC (instructions per cycle) per core (except for BT-MZ) increase. For the weak-scaling application GTC, the performance trend (relative speedup) with increasing threads per node is very similar regardless of how many nodes (32, 128, or 512) are used.
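To make the programming model concrete, below is a minimal sketch (in C, not taken from the paper or its applications) of the hybrid MPI/OpenMP pattern studied here: MPI handles inter-node communication, OpenMP threads provide intra-node parallelism, and the threads-per-node count the study varies (e.g., 32 vs. 64 on BlueGene/Q) is set through OMP_NUM_THREADS.

    /* Minimal hybrid MPI/OpenMP sketch (illustrative, not the paper's code).
     * Build: mpicc -fopenmp hybrid.c -o hybrid
     * Run:   OMP_NUM_THREADS=32 mpirun -np 4 ./hybrid
     */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank, size;

        /* Request threaded MPI; FUNNELED suffices when only the master
         * thread makes MPI calls, a common pattern in hybrid codes. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* OpenMP supplies intra-node parallelism; each rank reports its
         * thread count, the knob varied in the paper's experiments. */
        #pragma omp parallel
        {
            #pragma omp master
            printf("rank %d of %d running %d OpenMP threads\n",
                   rank, size, omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }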
Pages: 303-309
Page count: 7