Portable Mapping of Data Parallel Programs to OpenCL for Heterogeneous Systems

Cited by: 0
Authors
Grewe, Dominik [1 ]
Wang, Zheng [1 ]
O'Boyle, Michael F. P. [1 ]
Affiliations
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9YL, Midlothian, Scotland
Keywords
GPU; OpenCL; Machine-Learning Mapping;
DOI
Not available
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline Code
081202; 0835;
Abstract
General-purpose GPU-based systems are highly attractive as they offer potentially massive performance at little cost. Realizing this potential is challenging due to the complexity of programming. This paper presents a compiler-based approach to automatically generate optimized OpenCL code from data-parallel OpenMP programs for GPUs. Such an approach brings together the benefits of a clear high-level language (OpenMP) and an emerging standard (OpenCL) for heterogeneous multi-cores. A key feature of our scheme is that it leverages existing transformations, especially data transformations, to improve performance on GPU architectures, and it uses predictive modeling to automatically determine whether it is worthwhile to run the OpenCL code on the GPU or the OpenMP code on the multi-core host. We applied our approach to the entire NAS parallel benchmark suite and evaluated it on two distinct GPU-based systems: Core i7/NVIDIA GeForce GTX 580 and Core i7/AMD Radeon 7970. We achieved average (up to) speedups of 4.51x and 4.20x (143x and 67x), respectively, over a sequential baseline. This is, on average, a factor of 1.63 and 1.56 times faster than a hand-coded, GPU-specific OpenCL implementation developed by independent expert programmers.
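To make the OpenMP-to-OpenCL mapping described above concrete, the sketch below shows a hypothetical data-parallel loop and the kind of OpenCL kernel a source-to-source tool of this sort might emit. The loop, the kernel name, and the placeholder for the GPU-versus-host decision are illustrative assumptions, not the authors' generated code; OpenCL host setup (context, queue, buffers, kernel launch) is omitted for brevity.

/* Illustrative sketch only: a hypothetical vector-scale loop in OpenMP
 * and an OpenCL kernel it could be mapped to.  Compile with -fopenmp. */
#include <omp.h>
#include <stdio.h>

#define N 1024

/* Data-parallel OpenMP input: every iteration is independent. */
static void scale_omp(float *out, const float *in, float alpha, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        out[i] = alpha * in[i];
}

/* A generated OpenCL kernel would map the loop body onto work-items,
 * recovering the loop index from get_global_id(0). */
static const char *scale_cl_src =
    "__kernel void scale_cl(__global float *out,       \n"
    "                       __global const float *in,  \n"
    "                       const float alpha)         \n"
    "{                                                 \n"
    "    int i = get_global_id(0);                     \n"
    "    out[i] = alpha * in[i];                       \n"
    "}                                                 \n";

int main(void)
{
    float in[N], out[N];
    for (int i = 0; i < N; ++i)
        in[i] = (float)i;

    /* A predictive model, as described in the abstract, would decide here
     * whether to launch scale_cl on the GPU or run scale_omp on the host;
     * this sketch simply runs the OpenMP version. */
    scale_omp(out, in, 2.0f, N);
    printf("out[10] = %f\n", out[10]);
    (void)scale_cl_src;  /* kernel source shown for illustration only */
    return 0;
}

In the paper's setting, the interesting cases are loops whose memory-access patterns benefit from the data transformations mentioned above before such a kernel is emitted; this example deliberately uses the simplest possible access pattern.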
Pages: 161-170
Page count: 10