Locality-aware data replication in the last-level cache for large scale multicores

Authors
Farrukh Hijaz
Qingchuan Shi
George Kurian
Srinivas Devadas
Omer Khan
Affiliations
[1] University of Connecticut
[2] Massachusetts Institute of Technology
[3] Google
Source
Journal of Supercomputing, 2016, 72 (02)
Keywords
Multicore; Cache hierarchy; Data management; Energy efficiency
DOI
Not available
Abstract
Next-generation large single-chip multicores will process massive data with varying degrees of locality. Harnessing on-chip data locality to optimize the utilization of on-chip cache and network resources is of fundamental importance. We propose a locality-aware selective data replication protocol for the last-level cache (LLC). The goal is to lower memory access latency and energy by replicating only cache lines with high reuse in the LLC slice of the requesting core, while simultaneously keeping the off-chip miss rate low. The approach relies on a low-overhead yet highly accurate in-hardware runtime classifier that operates at cache-line granularity and allows replication only of cache lines with high reuse. Furthermore, the classifier captures the LLC pressure at the existing replica locations and adapts its replication decision accordingly. On a set of parallel benchmarks, the proposed protocol reduces overall energy by 14.7%, 10.7%, 10.5%, and 16.7% and completion time by 2.5%, 6.5%, 4.5%, and 9.5% when compared to the previously proposed Victim Replication, Adaptive Selective Replication, Reactive-NUCA, and Static-NUCA LLC management schemes, respectively. An efficient classifier implementation is evaluated with an overhead of 5.44 KB, which translates to only 1.58% on top of the Static-NUCA baseline's cache-related per-core storage.
Pages: 718 - 752
Page count: 34
Related papers
50 in total
  • [1] Locality-aware data replication in the last-level cache for large scale multicores
    Hijaz, Farrukh
    Shi, Qingchuan
    Kurian, George
    Devadas, Srinivas
    Khan, Omer
    [J]. JOURNAL OF SUPERCOMPUTING, 2016, 72 (02) : 718 - 752
  • [2] Locality-Aware Data Replication in the Last-Level Cache
    Kurian, George
    Devadas, Srinivas
    Khan, Omer
    [J]. 2014 20TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA-20), 2014 : 1 - 12
  • [3] Reuse locality aware cache partitioning for last-level cache
    Shen, Fanfan
    He, Yanxiang
    Zhang, Jun
    Li, Qingan
    Li, Jianhua
    Xu, Chao
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2019, 74 : 319 - 330
  • [4] Locality-Aware Mapping and Scheduling for Multicores
    Ding, Wei
    Zhang, Yuanrui
    Kandemir, Mahmut
    Srinivas, Jithendra
    Yedlapalli, Praveen
    [J]. PROCEEDINGS OF THE 2013 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2013 : 335 - 346
  • [5] LDAC: Locality-Aware Data Access Control for Large-Scale Multicore Cache Hierarchies
    Shi, Qingchuan
    Kurian, George
    Hijaz, Farrukh
    Devadas, Srinivas
    Khan, Omer
    [J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2016, 13 (04)
  • [6] LA-LLC: Inter-Core Locality-Aware Last-Level Cache to Exploit Many-to-Many Traffic in GPGPUs
    Zhao, Xia
    Liu, Yuxi
    Adileh, Almutaz
    Eeckhout, Lieven
    [J]. IEEE COMPUTER ARCHITECTURE LETTERS, 2017, 16 (01) : 42 - 45
  • [7] Last-level Cache Deduplication
    Tian, Yingying
    Khan, Samira M.
    Jimenez, Daniel A.
    Loh, Gabriel H.
    [J]. PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'14), 2014 : 53 - 62
  • [8] Software-Hardware Managed Last-level Cache Allocation Scheme for Large-Scale NVRAM-based Multicores Executing Parallel Data Analytics Applications
    Ahmad, Masab
    Dogan, Halit
    Checconi, Fabio
    Que, Xinyu
    Buono, Daniele
    Khan, Omer
    [J]. 2018 32ND IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2018 : 316 - 325
  • [9] Locality-aware cache random replacement policies
    Benedicte, Pedro
    Hernandez, Carles
    Abella, Jaume
    Cazorla, Francisco J.
    [J]. JOURNAL OF SYSTEMS ARCHITECTURE, 2019, 93 : 48 - 61
  • [10] A Reuse-Degree Based Locality Classifier for Locality-Aware Data Replication
    Wu, Qianqian
    Ji, Zhenzhou
    [J]. IEEE ACCESS, 2019, 7 : 182207 - 182216