Measurement and Analysis of GPU-Accelerated OpenCL Computations on Intel GPUs

Cited by: 2
Authors
Cherian, Aaron Thomas [1]
Zhou, Keren [1]
Grubisic, Dejan [1]
Meng, Xiaozhu [1]
Mellor-Crummey, John [1]
Affiliation
[1] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
Keywords
Supercomputers; High performance computing; Performance analysis; Parallel programming
DOI
10.1109/ProTools54808.2021.00009
Chinese Library Classification
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Graphics Processing Units (GPUs) have become a key technology for accelerating node performance in supercomputers, including the US Department of Energy's forthcoming exascale systems. Since the execution model for GPUs differs from that for conventional processors, applications need to be rewritten to exploit GPU parallelism. Performance tools are needed for such GPU-accelerated systems to help developers assess how well applications offload computation onto GPUs. In this paper, we describe extensions to Rice University's HPCToolkit performance tools that support measurement and analysis of Intel's DPC++ programming model for GPU-accelerated systems atop an implementation of the industry-standard OpenCL framework for heterogeneous parallelism on Intel GPUs. HPCToolkit supports three techniques for performance analysis of programs atop OpenCL on Intel GPUs. First, HPCToolkit supports profiling and tracing of OpenCL kernels. Second, HPCToolkit supports CPU-GPU blame shifting for OpenCL kernel executions, a profiling technique that can identify code that executes on one or more CPUs while GPUs are idle. Third, HPCToolkit supports fine-grained measurement, analysis, and attribution of performance metrics to OpenCL GPU kernels, including instruction counts, execution latency, and SIMD waste. The paper describes these capabilities and then illustrates their application in case studies with two applications that offload computations onto Intel GPUs.
Pages: 26-35
Page count: 10