CPU-Assisted GPGPU on Fused CPU-GPU Architectures

Cited by: 0
Authors
Yang, Yi [1 ]
Xiang, Ping [1 ]
Mantor, Mike [2 ]
Zhou, Huiyang [1 ]
Affiliations
[1] North Carolina State Univ, Dept Elect & Comp Engn, Raleigh, NC 27695 USA
[2] Adv Micro Devices Inc, Graph Prod Grp, Sunnyvale, CA USA
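Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract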
This paper presents a novel approach that uses CPU resources to facilitate the execution of GPGPU programs on fused CPU-GPU architectures. In our model of fused architectures, the GPU and the CPU are integrated on the same die and share the on-chip L3 cache and off-chip memory, similar to the latest Intel Sandy Bridge and AMD accelerated processing unit (APU) platforms. In our proposed CPU-assisted GPGPU, after the CPU launches a GPU program, it executes a pre-execution program, which is generated automatically from the GPU kernel by our proposed compiler algorithms and contains the memory access instructions of the GPU kernel for multiple thread blocks. The CPU pre-execution program runs ahead of the GPU threads because (1) the CPU pre-execution thread contains only the memory fetch instructions of the GPU kernel, not its floating-point computations, and (2) the CPU runs at a higher frequency and exploits a higher degree of instruction-level parallelism than the GPU scalar cores. We also leverage the L2-cache prefetcher on the CPU side to increase the memory traffic from the CPU. As a result, the memory accesses of GPU threads hit in the L3 cache and their latency is drastically reduced. Since our pre-execution is directly controlled by user-level applications, it enjoys both high accuracy and flexibility. Our experiments on a set of benchmarks show that the proposed pre-execution improves performance by up to 113%, and by 21.4% on average.
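The pre-execution idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's compiler algorithm or its simulator; it assumes a hypothetical kernel `c[i] = a[i] + b[i]`, a 64-byte cache line, and a simple LRU model of the shared L3 cache. All names (`SharedL3`, `kernel_addresses`, `cpu_pre_execute`, `gpu_run`) are invented for illustration. The CPU pass issues only the kernel's loads for several thread blocks, so the later GPU pass finds those lines already cached.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache-line size


class SharedL3:
    """Toy LRU cache shared by CPU and GPU; it tracks line numbers only."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()

    def access(self, addr):
        """Touch the line holding addr; return True on a hit."""
        line = addr // LINE_BYTES
        if line in self.lines:
            self.lines.move_to_end(line)  # refresh LRU position
            return True
        self.lines[line] = True
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)  # evict least recently used
        return False


def kernel_addresses(block_id, thread_id, block_dim, base_a, base_b, elem_bytes=4):
    """Addresses loaded by one thread of the hypothetical kernel c[i] = a[i] + b[i]."""
    i = block_id * block_dim + thread_id
    return [base_a + i * elem_bytes, base_b + i * elem_bytes]


def cpu_pre_execute(cache, blocks, block_dim, base_a, base_b):
    """CPU side: replay only the kernel's loads, with no floating-point work."""
    for b in blocks:
        for t in range(block_dim):
            for addr in kernel_addresses(b, t, block_dim, base_a, base_b):
                cache.access(addr)


def gpu_run(cache, blocks, block_dim, base_a, base_b):
    """GPU side: perform the kernel's loads and count L3 misses."""
    misses = 0
    for b in blocks:
        for t in range(block_dim):
            for addr in kernel_addresses(b, t, block_dim, base_a, base_b):
                if not cache.access(addr):
                    misses += 1
    return misses


A, B = 0, 1 << 20  # assumed line-aligned base addresses of a[] and b[]
blocks, block_dim = range(4), 64

cold = SharedL3(1024)
cold_misses = gpu_run(cold, blocks, block_dim, A, B)

warm = SharedL3(1024)
cpu_pre_execute(warm, blocks, block_dim, A, B)
warm_misses = gpu_run(warm, blocks, block_dim, A, B)

print(cold_misses, warm_misses)  # prints "32 0"
```

The model ignores timing, warp-level coalescing, and the CPU's L2 prefetcher, so it shows only why pre-touched lines turn GPU misses into L3 hits, not how large the speedup would be.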
Pages: 103-114
Page count: 12
Related papers
50 in total
  • [1] Reducing CPU-GPU Interferences to Improve CPU Performance in Heterogeneous Architectures
    Wen, Hao
    Zhang, Wei
    [J]. Journal of Computing Science and Engineering, 2020, 16 (04) : 131 - 145
  • [2] Denial of Service in CPU-GPU Heterogeneous Architectures
    Wen, Hao
    Zhang, Wei
    [J]. 2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020,
  • [3] Co-Scheduling on Fused CPU-GPU Architectures With Shared Last Level Caches
    Damschen, Marvin
    Mueller, Frank
    Henkel, Joerg
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2018, 37 (11) : 2337 - 2347
  • [4] Hardware Support for Concurrent Detection of Multiple Concurrency Bugs on Fused CPU-GPU Architectures
    Zhang, Weihua
    Yu, Shiqiang
    Wang, Haojun
    Dai, Zhuofang
    Chen, Haibo
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2016, 65 (10) : 3083 - 3095
  • [5] A Sample-Based Dynamic CPU and GPU LLC Bypassing Method for Heterogeneous CPU-GPU Architectures
    Wang, Xin
    Zhang, Wei
    [J]. 2017 16TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS / 11TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING / 14TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS, 2017, : 753 - 760
  • [6] CPU-Assisted GPU Thread Pool Model for Dynamic Task Parallelism
    Zhang, Shuai
    Li, Tao
    Dong, Qiankun
    Liu, Xuechen
    Yang, Yulu
    [J]. PROCEEDINGS OF THE 2015 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE AND STORAGE (NAS), 2015, : 135 - 140
  • [7] REDEFINING THE ROLE OF THE CPU IN THE ERA OF CPU-GPU INTEGRATION
    Arora, Manish
    Nath, Siddhartha
    Mazumdar, Subhra
    Baden, Scott B.
    Tullsen, Dean M.
    [J]. IEEE MICRO, 2012, 32 (06) : 4 - 16
  • [8] A comparison of Algebraic Multigrid Bidomain solvers on hybrid CPU-GPU architectures
    Centofanti, Edoardo
    Scacchi, Simone
    [J]. COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2024, 423
  • [9] Speeding up Planning in Multiagent Settings Using CPU-GPU Architectures
    Adoe, Fadel
    Chen, Yingke
    Doshi, Prashant
    [J]. AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2015, 2015, 9494 : 262 - 283
  • [10] iMLBench: A Machine Learning Benchmark Suite for CPU-GPU Integrated Architectures
    Zhang, Chenyang
    Zhang, Feng
    Guo, Xiaoguang
    He, Bingsheng
    Zhang, Xiao
    Du, Xiaoyong
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (07) : 1740 - 1752