Automatic and Portable Mapping of Data Parallel Programs to OpenCL for GPU-Based Heterogeneous Systems

Cited by: 24
Authors
Wang, Zheng [1]
Grewe, Dominik [2]
O'Boyle, Michael F. P. [2]
Affiliations
[1] Univ Lancaster, Sch Comp & Commun, Lancaster LA1 4YW, England
[2] Univ Edinburgh, Edinburgh EH8 9YL, Midlothian, Scotland
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Experimentation; Languages; Measurement; Performance; GPU; OpenCL; Machine-learning mapping;
DOI
10.1145/2677036
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
General-purpose GPU-based systems are highly attractive, as they give potentially massive performance at little cost. Realizing such potential is challenging due to the complexity of programming. This article presents a compiler-based approach to automatically generate optimized OpenCL code from data parallel OpenMP programs for GPUs. A key feature of our scheme is that it leverages existing transformations, especially data transformations, to improve performance on GPU architectures and uses automatic machine learning to build a predictive model to determine if it is worthwhile running the OpenCL code on the GPU or OpenMP code on the multicore host. We applied our approach to the entire NAS parallel benchmark suite and evaluated it on distinct GPU-based systems. We achieved average (up to) speedups of 4.51x and 4.20x (143x and 67x) on Core i7/NVIDIA GeForce GTX580 and Core i7/AMD Radeon 7970 platforms, respectively, over a sequential baseline. Our approach achieves, on average, greater than 10x speedups over two state-of-the-art automatic GPU code generators.
Pages: 26
Related papers
50 in total
  • [1] Portable Mapping of Data Parallel Programs to OpenCL for Heterogeneous Systems
    Grewe, Dominik
    Wang, Zheng
    O'Boyle, Michael F. P.
    [J]. PROCEEDINGS OF THE 2013 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2013, : 161 - 170
  • [2] Automatic Mapping for OpenCL-Programs on CPU/GPU Heterogeneous Platforms
    Moren, Konrad
    Goehringer, Diana
    [J]. COMPUTATIONAL SCIENCE - ICCS 2018, PT II, 2018, 10861 : 301 - 314
  • [3] Portable Parallel Programs with Python and OpenCL
    Di Pierro, Massimo
    [J]. COMPUTING IN SCIENCE & ENGINEERING, 2014, 16 (01) : 34 - 40
  • [4] GPU-Based Large Seismic Data Parallel Compression
    Xie, Kai
    Yu, H. Q.
    Lu, G. Y.
    [J]. INTELLIGENCE COMPUTATION AND EVOLUTIONARY COMPUTATION, 2013, 180 : 339 - 345
  • [5] Parallel GPU-based data-dependent triangulations
    Cervenansky, Michal
    Toth, Zsolt
    Starinsky, Juraj
    Ferko, Andrej
    Sramek, Milos
    [J]. COMPUTERS & GRAPHICS-UK, 2010, 34 (02) : 125 - 135
  • [6] GPU-based parallel algorithms for sparse nonlinear systems
    Galiano, V.
    Migallon, H.
    Migallon, V.
    Penades, J.
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2012, 72 (09) : 1098 - 1105
  • [7] A Portable OpenCL-based Approach for SVMs in GPU
    Cagnini, Henry E. L.
    Winck, Ana T.
    Barros, Rodrigo C.
    [J]. 2015 BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS 2015), 2015, : 198 - 203
  • [8] GPU-Based Soil Parameter Parallel Inversion for PolSAR Data
    Yin, Qiang
    Wu, You
    Zhang, Fan
    Zhou, Yongsheng
    [J]. REMOTE SENSING, 2020, 12 (03)
  • [9] Toward performance-portable PETSc for GPU-based exascale systems
    Mills, Richard Tran
    Adams, Mark F.
    Balay, Satish
    Brown, Jed
    Dener, Alp
    Knepley, Matthew
    Kruger, Scott E.
    Morgan, Hannah
    Munson, Todd
    Rupp, Karl
    Smith, Barry F.
    Zampini, Stefano
    Zhang, Hong
    Zhang, Junchao
    [J]. PARALLEL COMPUTING, 2021, 108
  • [10] GPU-Based Parallel Reservoir Simulators
    Chen, Zhangxin
    Liu, Hui
    Yu, Song
    Hsieh, Ben
    Shao, Lei
    [J]. DOMAIN DECOMPOSITION METHODS IN SCIENCE AND ENGINEERING XXI, 2014, 98 : 199 - 206