Performance analysis of four parallel programming models on NUMA architectures

Cited by: 0
Authors
Mohamed, AS [1 ]
Cantonnet, F [1 ]
Affiliations
[1] George Washington Univ, Dept Elect & Comp Engn, Washington, DC 20052 USA
Keywords
NAS; OpenMP; MPI; SHMEM; DSM
DOI
Not available
CLC classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Efficient parallel implementation of the NAS NPB benchmark is a challenging task. In this paper, we compare the performance of, and the programming effort required for, coding the NAS NPB benchmark under four leading parallel programming models: MPI, OpenMP, SHMEM, and DSM, on an SGI NUMA Origin 3800 system, a machine that supports all four models efficiently. We use the spectrum of performance-analysis and profiling tools within the SGI NUMA environment to monitor various low-level physical parameters and analyze the efficiency and performance of each programming model. Our objective is to compare the physically monitored parameters across the four programming models. Using this visualized information, we can better understand the communication, data/thread layouts, and I/O bottlenecks in these parallel programming models. Results indicate that the four models deliver comparable performance; however, the implementations differ significantly beyond merely using explicit messages versus implicit loads/stores, even though the basic parallel algorithms are similar. Compared with the message-passing (MPI) and SHMEM programming models, the cache-coherent distributed-shared-address-space DSM-UPC and shared-memory OpenMP models provide substantial ease of programming at both the conceptual and program-orchestration levels, often accompanied by performance gains. However, DSM-UPC currently has portability limitations and may suffer from poor spatial locality of physically distributed shared data on large numbers of processors.
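The abstract's central contrast, explicit messages (MPI, SHMEM) versus implicit loads/stores through a shared address space (OpenMP, DSM-UPC), can be made concrete with a small sketch. The following C program is not from the paper; it is a minimal illustration that computes the same global reduction under two of the four models. The array name a, the size N, and the build commands are assumptions chosen for the example.

```c
/* sum.c -- one reduction, two of the paper's four models (illustrative only).
 * OpenMP build:  cc -fopenmp sum.c
 * MPI build:     mpicc -DUSE_MPI sum.c
 * Assumes the process count divides N evenly, for brevity.
 */
#include <stdio.h>
#define N 1000000

#ifdef USE_MPI
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* MPI: each rank owns a private slice of the data, and combining
     * the partial sums requires an explicit communication call. */
    double local = 0.0, total = 0.0;
    for (int i = 0; i < N / size; i++)
        local += 1.0;                        /* this rank's slice */
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("sum = %.0f\n", total);
    MPI_Finalize();
    return 0;
}

#else  /* OpenMP */

int main(void) {
    static double a[N];
    for (int i = 0; i < N; i++) a[i] = 1.0;

    /* OpenMP: threads read the shared array through ordinary loads;
     * on a NUMA machine such as the Origin 3800, hardware cache
     * coherence moves the data, not the programmer. */
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %.0f\n", sum);
    return 0;
}
#endif
```

The orchestration gap the abstract measures is visible even in this toy: the OpenMP variant adds one pragma to serial code, while the MPI variant must decompose the data, manage ranks, and name its communication explicitly.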
Pages: 119-125
Number of pages: 7
Related papers (50 total)
  • [42] A comparative analysis of parallel programming language complexity and performance
    Vanderwiel, SP
    Nathanson, D
    Lilja, DJ
    [J]. CONCURRENCY-PRACTICE AND EXPERIENCE, 1998, 10(10): 807-820
  • [43] Parallel programming: Models, methods and programming languages
    Hammond, K
    [J]. EURO-PAR 2002 PARALLEL PROCESSING, PROCEEDINGS, 2002, 2400: 603-604
  • [44] Parallel Programming Tools for Multi-core Architectures
    Mohr, Bernd
    Krammer, Bettina
    Mix, Hartmut
    [J]. PARALLEL COMPUTING: FROM MULTICORES AND GPU'S TO PETASCALE, 2010, 19: 643-652
  • [45] A Parallel Programming Framework Orchestrating Multiple Languages and Architectures
    Murase, Masana
    Maeda, Kumiko
    Doi, Munehiro
    Komatsu, Hideaki
    Noda, Shigeho
    Himeno, Ryutaro
    [J]. PROCEEDINGS OF THE 2011 8TH ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS (CF 2011), 2011
  • [46] Introduction to the special section on parallel architectures, algorithms and programming
    Sang, Yingpeng
    Tian, Hui
    Park, James J.
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2016, 50: 125-126
  • [47] Genetic programming to design communication algorithms for parallel architectures
    Comellas, F.
    Gimenez, G.
    [J]. Parallel Processing Letters, 1998, 8(4): 549-560
  • [48] COMPARING 2 PARALLEL LOGIC-PROGRAMMING ARCHITECTURES
    TICK, E
    [J]. IEEE SOFTWARE, 1989, 6(4): 71-80
  • [49] Assessing the performance portability of modern parallel programming models using TeaLeaf
    Martineau, Matthew
    McIntosh-Smith, Simon
    Gaudin, Wayne
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29(15)
  • [50] Parallel Programming Models in High-Performance Cloud (ParaMo 2019)
    Oh, Sangyoon
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33(18)