A Performance Prediction Model for Memory-intensive GPU Kernels

Cited by: 3
Authors
Hu, Zhidan [1 ]
Liu, Guangming [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Hunan, Peoples R China
Keywords
GPU; CUDA; performance prediction; memory-intensive;
DOI
10.1109/SCAC.2014.10
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
Commodity graphics processing units (GPUs) have rapidly evolved into high-performance accelerators for data-parallel computing, offering a large array of processing cores and the CUDA programming model with a C-like interface. However, optimizing an application for maximum performance on the GPU architecture is not a trivial task, owing to the tremendous shift from conventional multi-core to many-core architectures. Moreover, GPU vendors disclose few details about the characteristics of the GPU's architecture. To provide insight into the performance of memory-intensive kernels, we propose a pipelined global memory model that incorporates the most critical factor affecting global memory performance, the uncoalesced memory access pattern, and provides a basis for predicting the performance of memory-intensive kernels. As we demonstrate, the pipeline throughput is dynamic and sensitive to memory access patterns. We validated our model on NVIDIA GPUs using CUDA (Compute Unified Device Architecture). The experimental results show that the pipeline captures global-memory-related performance factors and that the proposed model can estimate the performance of memory-intensive GPU kernels.
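The coalescing effect the abstract highlights can be illustrated with a small sketch. This is a hypothetical simplification, not the paper's actual model: it assumes a 128-byte transaction segment and a fixed-interval pipelined memory unit, and the function names, `latency`, and `issue_interval` values are invented for illustration.

```python
# Hypothetical sketch, NOT the paper's model: a toy estimate of how the
# memory access pattern (stride) drives the number of global-memory
# transactions a warp issues, and hence a pipelined memory-time estimate.

WARP_SIZE = 32        # threads per warp on NVIDIA GPUs
SEGMENT_BYTES = 128   # assumed size of one coalesced memory transaction

def transactions_per_warp(stride_elems: int, elem_bytes: int = 4) -> int:
    """Count the distinct 128-byte segments touched when the 32 threads of
    a warp read a[tid * stride_elems] (simplified coalescing rule)."""
    segments = {(tid * stride_elems * elem_bytes) // SEGMENT_BYTES
                for tid in range(WARP_SIZE)}
    return len(segments)

def predicted_mem_cycles(n_warps: int, stride_elems: int,
                         latency: int = 400, issue_interval: int = 4) -> int:
    """Toy pipelined-memory estimate: pay one full latency to fill the
    pipeline, then retire one transaction every `issue_interval` cycles.
    The latency and issue_interval values are illustrative, not measured."""
    txns = n_warps * transactions_per_warp(stride_elems)
    return latency + txns * issue_interval

# Fully coalesced (stride 1): one 128-byte transaction serves the warp.
# Strided by 32 elements (128 bytes): every thread hits its own segment.
print(transactions_per_warp(1))   # -> 1
print(transactions_per_warp(32))  # -> 32
```

Under this simplification, the same warp count costs up to 32x more memory transactions when accesses are fully uncoalesced, which is why the abstract treats the access pattern as the dominant factor for memory-intensive kernels.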
Pages: 14 - 18
Page count: 5
Related Papers
50 records in total
  • [31] Comparing unified, pinned, and host/device memory allocations for memory-intensive workloads on Tegra SoC
    Choi, Jake
    You, Hojun
    Kim, Chongam
    Yeom, Heon Young
    Kim, Yoonhee
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (04):
  • [32] MOMA: Mapping of Memory-intensive Software-pipelined Applications for Systems with Multiple Memory Controllers
    Jahn, Janmartin
    Pagani, Santiago
    Chen, Jian-Jia
    Henkel, Joerg
    2013 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2013, : 508 - 515
  • [33] Left parietal alpha enhancement during working memory-intensive sentence processing
    Meyer, Lars
    Obleser, Jonas
    Friederici, Angela D.
    CORTEX, 2013, 49 (03) : 711 - 721
  • [34] An Experimental Associative Capacitive Network based on Complementary Resistive Switches for Memory-intensive Computing
    Nielen, L.
    Tappertzhofen, S.
    Linn, E.
    Waser, R.
    Kavehei, O.
    2014 IEEE SILICON NANOELECTRONICS WORKSHOP (SNW), 2014,
  • [35] FAST CONVOLUTION KERNELS ON PASCAL GPU WITH HIGH MEMORY EFFICIENCY
    Chang, Qiong
    Onishi, Masaki
    Maruyama, Tsutomu
    HIGH PERFORMANCE COMPUTING SYMPOSIUM (HPC 2018), 2018, 50 (04):
  • [36] Utilizing GPU Performance Counters to Characterize GPU Kernels via Machine Learning
    Zigon, Bob
    Song, Fengguang
    COMPUTATIONAL SCIENCE - ICCS 2020, PT I, 2020, 12137 : 88 - 101
  • [37] A Statistical Performance Prediction Model for OpenCL Kernels on NVIDIA GPUs
    Karami, Ali
    Mirsoleimani, Sayyed Ali
    Khunjush, Farshad
    2013 17TH CSI INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND DIGITAL SYSTEMS (CADS 2013), 2013, : 15 - 22
  • [38] QZRAM: A Transparent Kernel Memory Compression System Design for Memory-Intensive Applications with QAT Accelerator Integration
    Gao, Chi
    Xu, Xiaofei
    Yang, Zhizou
    Lin, Liwei
    Li, Jian
    APPLIED SCIENCES-BASEL, 2023, 13 (18):
  • [39] SPMPool: Runtime SPM Management for Memory-Intensive Applications in Embedded Many-Cores
    Tajik, Hossein
    Donyanavard, Bryan
    Dutt, Nikil
    Jahn, Janmartin
    Henkel, Joerg
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2016, 16 (01)
  • [40] BACM: Barrier-Aware Cache Management for Irregular Memory-Intensive GPGPU Workloads
    Liu, Yuxi
    Zhao, Xia
    Yu, Zhibin
    Wang, Zhenlin
    Wang, Xiaolin
    Luo, Yingwei
    Eeckhout, Lieven
    2017 IEEE 35TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD), 2017, : 633 - 640