A Fine-grained Performance Model for GPU Architectures

Cited by: 0
Authors
Bombieri, Nicola [1]
Busato, Federico [1]
Fummi, Franco [1]
Affiliations
[1] Univ Verona, Dept Comp Sci, I-37100 Verona, Italy
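Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract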
The increasing programmability, performance, and cost-effectiveness of GPUs have led to widespread use of such many-core architectures to accelerate general-purpose applications. Nevertheless, tuning applications to exploit the GPU's potential efficiently is a very challenging task, especially for inexperienced programmers. This is due to the difficulty of developing a software application for the specific GPU architectural configuration, which includes managing the memory hierarchy and optimizing the execution of thousands of concurrent threads while maintaining the semantic correctness of the application. Although several profiling tools exist that provide programmers with a large number of metrics and measurements, it is often difficult to interpret such information to tune the application effectively. This paper presents a performance model that accurately estimates the potential performance of the application under tuning on a given GPU device and, at the same time, provides programmers with interpretable profiling hints. The paper shows the results obtained by applying the proposed model to profile commonly used primitives and real codes.
Pages: 1267-1272
Number of pages: 6