Unified Schemes for Directive-Based GPU Offloading

Cited by: 0
Authors
Miki, Yohei [1]
Hanawa, Toshihiro [1]
Affiliations
[1] Univ Tokyo, Informat Technol Ctr, Chiba 2770882, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
Japan Society for the Promotion of Science;
Keywords
Graphics processing units; Codes; Kernel; Costs; Multicore processing; Switches; Supercomputers; Programming; Libraries; User interfaces; Directive; GPU; OpenACC; OpenMP target; preprocessor macro; vendor lock-in;
DOI
10.1109/ACCESS.2024.3509380
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The GPU is the dominant accelerator device owing to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes originally developed for multicore CPUs. Although OpenACC and OpenMP target provide similar features, each has pros and cons. OpenACC offers richer functionality and abundant documentation, but in practice it is limited to NVIDIA GPUs. OpenMP target supports NVIDIA, AMD, and Intel GPUs but offers fewer functions than OpenACC. Here, we have developed a header-only library, Solomon (Simple Off-LOading Macros Orchestrating multiple Notations), which unifies the interface for GPU offloading and supports both OpenACC and OpenMP target. Solomon provides three types of notation to reduce users' implementation and learning costs: an intuitive notation for beginners and OpenACC-like and OpenMP-like notations for experienced developers. This manuscript describes Solomon's implementation and usage and demonstrates GPU offloading in an N-body simulation and a solver for the three-dimensional diffusion equation. The library and sample codes are provided as open-source software and are freely available at https://github.com/ymikirepo/solomon.
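The idea of unifying OpenACC and OpenMP target behind preprocessor macros can be illustrated with a minimal sketch in C. The macro names below (`OFFLOAD_LOOP`, `MAP_TO`, `MAP_TOFROM`, `DO_PRAGMA`) and the `axpy` function are hypothetical and are not taken from Solomon; the library's actual notations should be looked up in its repository. The sketch assumes a compiler that defines `_OPENACC` or `_OPENMP` when the corresponding model is enabled, and it falls back to a plain serial loop otherwise.

```c
/* Hypothetical sketch of macro-based directive switching.
 * These names are illustrative only; they are NOT Solomon's interface. */

/* Turn the (already macro-expanded) arguments into a _Pragma string. */
#define DO_PRAGMA_(...) _Pragma(#__VA_ARGS__)
#define DO_PRAGMA(...) DO_PRAGMA_(__VA_ARGS__)

#if defined(_OPENACC)
/* OpenACC backend */
#define OFFLOAD_LOOP(...) DO_PRAGMA(acc parallel loop __VA_ARGS__)
#define MAP_TO(...) copyin(__VA_ARGS__)
#define MAP_TOFROM(...) copy(__VA_ARGS__)
#elif defined(_OPENMP)
/* OpenMP target backend */
#define OFFLOAD_LOOP(...) \
    DO_PRAGMA(omp target teams distribute parallel for __VA_ARGS__)
#define MAP_TO(...) map(to: __VA_ARGS__)
#define MAP_TOFROM(...) map(tofrom: __VA_ARGS__)
#else
/* No directive model enabled: the loop runs serially on the host. */
#define OFFLOAD_LOOP(...)
#define MAP_TO(...)
#define MAP_TOFROM(...)
#endif

/* y[i] = a * x[i] + y[i], written once for every backend. */
void axpy(int n, float a, const float *x, float *y)
{
    OFFLOAD_LOOP(MAP_TO(x[0:n]) MAP_TOFROM(y[0:n]))
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `axpy(n, a, x, y)` from host code works the same way in all three configurations; only the compiler flags decide which directive, if any, the macro expands to. Because the switch happens entirely in the preprocessor, a header of this kind needs no separately compiled component, which is what makes a header-only design possible.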
Pages: 181644 - 181665
Page count: 22