Locality Protected Dynamic Cache Allocation Scheme on GPUs

Cited by: 0
Authors
Zhang, Yang [1 ]
Xing, Zuocheng [1 ]
Zhou, Li [2 ]
Zhu, Chunsheng [3 ]
Affiliations
[1] Natl Univ Def Technol, Natl Lab Parallel & Distributed Proc, Changsha, Hunan, Peoples R China
[2] Natl Univ Def Technol, Sch Elect Sci & Engn, Changsha, Hunan, Peoples R China
[3] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC, Canada
Keywords
PARALLELISM;
DOI
10.1109/TrustCom.2016.235
CLC number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
As we approach the exascale era in supercomputing, designing a balanced computer system that combines powerful computing ability with low energy consumption becomes increasingly important. The GPU is a widely used accelerator in recently deployed supercomputers: it adopts massive multithreading to hide long latencies and achieves high energy efficiency. In contrast to their strong computing power, GPUs have few on-chip resources, with only several MB of fast on-chip memory storage per SM (Streaming Multiprocessor). GPU caches exhibit poor efficiency due to the mismatch between the throughput-oriented execution model and the cache hierarchy design. Owing to this severe deficiency in on-chip memory, the benefit of the GPU's high computing capacity is dramatically pulled down by poor cache performance, which limits system performance and energy efficiency. In this paper, we put forward a locality-protected scheme that makes full use of data locality within the fixed cache capacity. We present a Locality-Protected method based on the instruction PC (LPP) to improve GPU performance. First, a PC-based collector gathers the reuse information of each cache line. An intelligent cache allocation unit (ICAU) then combines this dynamic reuse information with the LRU (Least Recently Used) replacement policy to identify the cache line with the least locality for eviction. The results show that LPP provides up to a 17.8% speedup and an average improvement of 5.5% over the baseline method.
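The eviction mechanism described in the abstract can be sketched as a small software model. This is a hedged illustration of the general idea, not the authors' hardware design: the class name `LPPCacheSet`, the per-PC reuse counters, and the LRU tie-breaking rule are assumptions filled in from the abstract's description of the PC-based collector and the ICAU.

```python
# Simplified single-set model of a PC-based locality-protected cache
# (hypothetical sketch): a reuse collector indexed by the load's PC,
# plus an allocation unit that evicts the line whose PC has shown the
# least reuse, breaking ties by LRU order.
from collections import OrderedDict, defaultdict

class LPPCacheSet:
    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()     # tag -> load PC; iteration order = LRU (oldest first)
        self.reuse = defaultdict(int)  # PC -> observed reuse count (the "collector")

    def access(self, tag, pc):
        """Returns True on a hit, False on a miss (allocating the line)."""
        if tag in self.lines:
            self.reuse[self.lines[tag]] += 1  # credit the reuse to the line's load PC
            self.lines.move_to_end(tag)       # refresh LRU position
            return True
        if len(self.lines) >= self.ways:
            # Evict the line whose PC shows the least reuse; min() scans in
            # insertion (LRU) order, so ties fall back to the oldest line.
            victim = min(self.lines, key=lambda t: self.reuse[self.lines[t]])
            del self.lines[victim]
        self.lines[tag] = pc
        return False
```

In this toy model, a line loaded by a PC with a history of re-references survives eviction even when it is the LRU line, which mirrors the "protection" of high-locality lines that the abstract describes.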
Pages: 1524-1530 (7 pages)
Related papers (50 in total)
  • [31] Thread scheduling for cache locality
    Philbin, J.
    Edler, J.
    Anshus, O.J.
    Douglas, C.C.
    Li, K.
    COMPUTER ARCHITECTURE NEWS, 1996, 24 (Special Issue): 60 - 71
  • [32] Adaptive and Transparent Cache Bypassing for GPUs
    Li, Ang
    van den Braak, Gert-Jan
    Kumar, Akash
    Corporaal, Henk
    PROCEEDINGS OF SC15: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2015,
  • [34] Dynamic cache invalidation scheme for wireless mobile environments
    Madhukar, Alok
    Ozyer, Tansel
    Alhajj, Reda
    WIRELESS NETWORKS, 2009, 15 (06) : 727 - 740
  • [35] A Penalty Aware Memory Allocation Scheme for Key-value Cache
    Ou, Jianqiang
    Patton, Marc
    Moore, Michael Devon
    Xu, Yuehai
    Jiang, Song
    2015 44TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2015, : 530 - 539
  • [36] DYNAMIC STORAGE-ALLOCATION SCHEME
    ILIFFE, JK
    JODEIT, JG
    COMPUTER JOURNAL, 1962, 5 (03): 200+
  • [37] A Cache Allocation Scheme in 5G-Enabled Inhomogeneous ICVs
    Wang, Cong
    Chen, Chen
    Liu, Yangyang
    Fan, Kefeng
    Pei, Qingqi
    He, Ci
    Dou, Zhibin
    2020 IEEE 92ND VEHICULAR TECHNOLOGY CONFERENCE (VTC2020-FALL), 2020,
  • [38] Dynamic RACH Preamble Allocation Scheme
    Hwang, Hyun-Yong
    Oh, Sung-Min
    Lee, Changhee
    Kim, Jae Heung
    Shin, Jaesheung
    2015 INTERNATIONAL CONFERENCE ON ICT CONVERGENCE (ICTC), 2015, : 770 - 772
  • [39] Soft error mitigation in cache memories of embedded systems by means of a protected scheme
    Zarandi, HR
    Miremadi, SG
    DEPENDABLE COMPUTING, PROCEEDINGS, 2005, 3747 : 121 - 130
  • [40] Dynamic Storage Cache Allocation in Multi-Server Architectures
    Prabhakar, Ramya
    Srikantaiah, Shekhar
    Patrick, Christina
    Kandemir, Mahmut
    PROCEEDINGS OF THE CONFERENCE ON HIGH PERFORMANCE COMPUTING NETWORKING, STORAGE AND ANALYSIS, 2009,