A tool for top-down performance analysis of GPU-accelerated applications

Cited by: 6
Authors
Zhou, Keren [1]
Krentel, Mark [1]
Mellor-Crummey, John [1]
Affiliation
[1] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
Keywords
GPU; Profiler; Wait-free data structure; Calling context tree; HPC; Roofline
DOI
10.1145/3332466.3374534
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
To support performance measurement and analysis of GPU-accelerated applications, we extended the HPCToolkit performance tools with several novel features. To support efficient monitoring of accelerated applications, HPCToolkit employs a new wait-free data structure to coordinate measurement and attribution between each application thread and a GPU monitor thread. To help developers understand the performance of accelerated applications, HPCToolkit attributes metrics to heterogeneous calling contexts that span both CPUs and GPUs. To support fine-grain analysis and tuning of GPU-accelerated code, HPCToolkit collects PC samples of both CPU and GPU activity to derive and attribute metrics at all levels in a heterogeneous calling context.
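The abstract does not describe the wait-free data structure itself, so the following is only a minimal sketch of one common way such coordination can be done: a bounded single-producer/single-consumer ring buffer in which an application thread publishes records and a monitor thread consumes them. It is not HPCToolkit's implementation. The names `GpuOpRecord`, `SpscChannel`, `correlation_id`, `cpu_context_id`, and `demo_round_trip` are hypothetical and chosen for illustration.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record: one GPU operation launched from a CPU calling context.
struct GpuOpRecord {
    std::uint64_t correlation_id;  // assumed id linking GPU activity to its launch
    std::uint32_t cpu_context_id;  // assumed handle for the CPU calling context
};

// Bounded single-producer/single-consumer ring buffer (illustrative only).
// Each call finishes in a fixed number of steps with no locks or retry
// loops, so neither thread can be delayed by the other: it is wait-free
// as long as exactly one thread pushes and exactly one thread pops.
template <typename T, std::size_t Capacity>
class SpscChannel {
    static_assert(Capacity >= 2, "one slot is kept empty to tell full from empty");

public:
    // Producer side (e.g. an application thread). Returns false when full.
    bool try_push(const T& item) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire)) {
            return false;  // full: the consumer has not caught up yet
        }
        slots_[head] = item;
        // Release makes the slot contents visible before the new head is.
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side (e.g. a GPU monitor thread). Returns false when empty.
    bool try_pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) {
            return false;  // empty: nothing published yet
        }
        out = slots_[tail];
        // Release hands the slot back to the producer for reuse.
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write (producer-owned)
    std::atomic<std::size_t> tail_{0};  // next slot to read (consumer-owned)
};

// Single-threaded walk-through of the protocol. Returns how many records
// the consumer side received.
inline std::size_t demo_round_trip() {
    SpscChannel<GpuOpRecord, 4> channel;  // 4 slots hold at most 3 records
    std::size_t accepted = 0;
    for (std::uint64_t id = 1; id <= 5; ++id) {
        if (channel.try_push(GpuOpRecord{id, 7})) {
            ++accepted;
        }
    }
    assert(accepted == 3);  // the 4th and 5th pushes are rejected as full

    GpuOpRecord record{};
    std::size_t received = 0;
    while (channel.try_pop(record)) {
        ++received;
    }
    return received;
}
```

In a real profiler one such channel would typically exist per application thread, with a second channel in the opposite direction so the monitor thread can return measured GPU activity for attribution to the originating calling context. That pairing is an assumption about the design, not something stated in the abstract.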
Pages: 415-416
Page count: 2