Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters

Cited by: 71
Authors
Yang, Chao-Tung [1]
Huang, Chih-Lin [1]
Lin, Cheng-Fang [1]
Affiliations
[1] Tunghai Univ, Dept Comp Sci, Taichung 40704, Taiwan
Keywords
CUDA; GPU; MPI; OpenMP; Hybrid; Parallel programming
DOI
10.1016/j.cpc.2010.06.035
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
NVIDIA's CUDA is a general-purpose, scalable parallel programming model for writing highly parallel applications. It provides several key abstractions: a hierarchy of thread blocks, shared memory, and barrier synchronization. This model has proven quite successful at programming multithreaded many-core GPUs and scales transparently to hundreds of cores; scientists throughout industry and academia are already using CUDA to achieve dramatic speedups on production and research codes. In this paper, we propose a parallel programming approach using hybrid CUDA, OpenMP, and MPI programming, which partitions loop iterations according to the number of C1060 GPU nodes in a GPU cluster consisting of one C1060 and one S1070. Loop iterations assigned to one MPI process are processed in parallel by CUDA, run by the processor cores in the same computational node. (C) 2010 Elsevier B.V. All rights reserved.
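The partitioning step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which partitioning scheme the authors use, so this assumes a plain static block partition; the function names `block_range` and `partition` are hypothetical and do not come from the paper. In a real hybrid program the rank and process count would come from `MPI_Comm_rank` and `MPI_Comm_size`, and each process would hand its range to a CUDA kernel or OpenMP threads; here the ranks are only simulated.

```python
def block_range(total_iters, num_procs, rank):
    """Return the half-open range [begin, end) of loop iterations
    assigned to one process under a static block partition (assumed scheme)."""
    if num_procs <= 0:
        raise ValueError("num_procs must be positive")
    if not 0 <= rank < num_procs:
        raise ValueError("rank must be in [0, num_procs)")
    # Every process gets `base` iterations; the first `extra` processes
    # each take one more so that all iterations are covered exactly once.
    base, extra = divmod(total_iters, num_procs)
    begin = rank * base + min(rank, extra)
    end = begin + base + (1 if rank < extra else 0)
    return begin, end


def partition(total_iters, num_procs):
    """Return the iteration range of every simulated process, in rank order."""
    return [block_range(total_iters, num_procs, r) for r in range(num_procs)]


if __name__ == "__main__":
    # Simulate splitting 10 loop iterations across 3 processes.
    for rank, (begin, end) in enumerate(partition(10, 3)):
        print(f"rank {rank}: iterations [{begin}, {end})")
```

A block partition keeps each process's iterations contiguous, which suits copying one slice of data to a GPU per process; a self-scheduling scheme would instead hand out chunks dynamically.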
Pages: 266-269
Page count: 4
Related papers
50 in total
  • [1] Parallel programming for OSEM reconstruction with MPI, OpenMP, and hybrid MPI-OpenMP
    Jones, MD
    Yao, RT
    [J]. 2004 IEEE NUCLEAR SCIENCE SYMPOSIUM CONFERENCE RECORD, VOLS 1-7, 2004, : 3036 - 3042
  • [2] An approach of performance comparisons with OpenMP and CUDA parallel programming on multicore systems
    Chang, Chih-Hung
    Lu, Chih-Wei
    Yang, Chao-Tung
    Chang, Tzu-Chieh
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (16): : 4230 - 4245
  • [3] Using hybrid MPI and OpenMP programming to optimize communications in parallel loop self-scheduling schemes for multicore PC clusters
    Chao-Chin Wu
    Lien-Fu Lai
    Chao-Tung Yang
    Po-Hsun Chiu
    [J]. The Journal of Supercomputing, 2012, 60 : 31 - 61
  • [4] Using hybrid MPI and OpenMP programming to optimize communications in parallel loop self-scheduling schemes for multicore PC clusters
    Wu, Chao-Chin
    Lai, Lien-Fu
    Yang, Chao-Tung
    Chiu, Po-Hsun
    [J]. JOURNAL OF SUPERCOMPUTING, 2012, 60 (01): : 31 - 61
  • [5] Performance-based parallel loop self-scheduling using hybrid OpenMP and MPI programming on multicore SMP clusters
    Yang, Chao-Tung
    Wu, Chao-Chin
    Chang, Jen-Hsiang
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (08): : 721 - 744
  • [6] Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes
    Rabenseifner, Rolf
    Hager, Georg
    Jost, Gabriele
    [J]. PROCEEDINGS OF THE PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, 2009, : 427 - +
  • [7] Enabling Mixed OpenMP/MPI Programming on Hybrid CPU/GPU Computing Architecture
    Liang, Tyng-Yeu
    Li, Hung-Fu
    Chiu, Jun-Yao
    [J]. 2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS & PHD FORUM (IPDPSW), 2012, : 2369 - 2377
  • [8] A compound OpenMP/MPI program development toolkit for hybrid CPU/GPU clusters
    Li, Hung-Fu
    Liang, Tyng-Yeu
    Chiu, Jun-Yao
    [J]. JOURNAL OF SUPERCOMPUTING, 2013, 66 (01): : 381 - 405
  • [9] A compound OpenMP/MPI program development toolkit for hybrid CPU/GPU clusters
    Hung-Fu Li
    Tyng-Yeu Liang
    Jun-Yao Chiu
    [J]. The Journal of Supercomputing, 2013, 66 : 381 - 405
  • [10] Hybrid MPI-OpenMP programming for parallel OSEM PET reconstruction
    Jones, M. D.
    Yao, R.
    Bhole, C. P.
    [J]. IEEE TRANSACTIONS ON NUCLEAR SCIENCE, 2006, 53 (05) : 2752 - 2758