A tool for top-down performance analysis of GPU-accelerated applications

Cited by: 6
Authors
Zhou, Keren [1]
Krentel, Mark [1]
Mellor-Crummey, John [1]
Affiliations
[1] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
Keywords
GPU; Profiler; Wait-free data structure; Calling context tree; HPC; Roofline;
DOI
10.1145/3332466.3374534
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
To support performance measurement and analysis of GPU-accelerated applications, we extended the HPCToolkit performance tools with several novel features. To support efficient monitoring of accelerated applications, HPCToolkit employs a new wait-free data structure to coordinate measurement and attribution between each application thread and a GPU monitor thread. To help developers understand the performance of accelerated applications, HPCToolkit attributes metrics to heterogeneous calling contexts that span both CPUs and GPUs. To support fine-grain analysis and tuning of GPU-accelerated code, HPCToolkit collects PC samples of both CPU and GPU activity to derive and attribute metrics at all levels in a heterogeneous calling context.
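To make the idea of attributing metrics to heterogeneous calling contexts concrete, here is a minimal sketch of a calling context tree whose paths cross from CPU frames into GPU frames. It is an illustration only, not HPCToolkit's implementation: the names `CCTNode` and `attribute`, the frame names, and the sample counts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One calling context: a frame plus the device it runs on (hypothetical layout)."""
    frame: str
    device: str                      # "cpu" or "gpu"
    exclusive: float = 0.0           # samples attributed directly to this context
    children: dict = field(default_factory=dict)

    def child(self, frame, device):
        # Reuse the existing child for this (frame, device) pair, or create it.
        key = (frame, device)
        if key not in self.children:
            self.children[key] = CCTNode(frame, device)
        return self.children[key]

    def inclusive(self):
        # Inclusive metric = own samples plus those of every descendant.
        return self.exclusive + sum(c.inclusive() for c in self.children.values())


def attribute(root, call_path, samples):
    """Walk (or extend) the tree along call_path and add samples at the leaf."""
    node = root
    for frame, device in call_path:
        node = node.child(frame, device)
    node.exclusive += samples
    return node


root = CCTNode("<root>", "cpu")
launch = [("main", "cpu"), ("solve", "cpu"), ("cudaLaunchKernel", "cpu")]

# Made-up sample counts: GPU samples hang below the CPU context that launched the kernel.
attribute(root, launch + [("kernel", "gpu"), ("device_fn", "gpu")], 30)
attribute(root, launch + [("kernel", "gpu")], 10)
attribute(root, [("main", "cpu"), ("io", "cpu")], 5)

main = root.children[("main", "cpu")]
solve = main.children[("solve", "cpu")]
print(main.inclusive(), solve.inclusive())
```

Because GPU frames are children of the CPU launch context, an inclusive metric at any CPU frame automatically covers the GPU work it triggered, which is what makes a top-down view across both processors possible.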
Pages: 415-416
Page count: 2