Locality-aware data replication in the last-level cache for large scale multicores

Cited by: 0

Authors:

Farrukh Hijaz

Qingchuan Shi

George Kurian

Srinivas Devadas

Omer Khan

Affiliations:

[1] University of Connecticut

[2] Massachusetts Institute of Technology

[3] Google

Source:

The Journal of Supercomputing | 2016 / Volume 72

Keywords:

Multicore; Cache hierarchy; Data management; Energy efficiency

DOI:

Not available

Abstract:

Next-generation large single-chip multicores will process massive data with varying degrees of locality. Harnessing on-chip data locality to optimize the utilization of on-chip cache and network resources is of fundamental importance. We propose a locality-aware selective data replication protocol for the last-level cache (LLC). The goal is to lower memory access latency and energy by replicating only cache lines with high reuse in the LLC slice of the requesting core, while simultaneously keeping the off-chip miss rate low. The approach relies on a low-overhead yet highly accurate in-hardware, runtime, cache-line-level classifier that only allows replication of cache lines with high reuse. Furthermore, the classifier captures the LLC pressure at the existing replica locations and adapts its replication decision accordingly. On a set of parallel benchmarks, the proposed protocol reduces overall energy by 14.7, 10.7, 10.5, and 16.7% and completion time by 2.5, 6.5, 4.5, and 9.5% when compared to the previously proposed Victim Replication, Adaptive Selective Replication, Reactive-NUCA, and Static-NUCA LLC management schemes, respectively. An efficient classifier implementation is evaluated with an overhead of 5.44 KB, which translates to only 1.58% on top of the Static-NUCA baseline’s cache-related per-core storage.
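The reuse-based replication decision described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `ReuseClassifier`, the per-(core, line) counter table, the threshold value of 3, and the `demote` hook (standing in for the adaptation to LLC pressure) are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the abstract does not state the value used in the paper.
REPLICATION_THRESHOLD = 3


@dataclass
class ReuseClassifier:
    """Toy per-(core, cache line) reuse counter; an illustrative assumption."""

    threshold: int = REPLICATION_THRESHOLD
    reuse: dict = field(default_factory=dict)  # (core, line_addr) -> access count

    def record_access(self, core: int, line_addr: int) -> bool:
        """Count one access; return True once reuse reaches the threshold,
        meaning the line may be replicated in the requesting core's LLC slice."""
        key = (core, line_addr)
        self.reuse[key] = self.reuse.get(key, 0) + 1
        return self.reuse[key] >= self.threshold

    def demote(self, core: int, line_addr: int) -> None:
        """Reset the counter, e.g. after a replica is evicted under LLC pressure
        (a hypothetical stand-in for the adaptive behavior in the abstract)."""
        self.reuse[(core, line_addr)] = 0


classifier = ReuseClassifier()
decisions = [classifier.record_access(core=0, line_addr=0x40) for _ in range(3)]
print(decisions)
```

In this sketch a line is replicated only after a core has touched it `threshold` times, so lines with low reuse never consume space in the local LLC slice.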

Pages: 718-752

Page count: 34

