Spatial Locality-Aware Cache Partitioning for Effective Cache Sharing

Cited by: 13
Authors
Gupta, Saurabh [1]
Zhou, Huiyang [2]
Affiliations
[1] Oak Ridge Natl Lab, Oak Ridge, TN USA
[2] North Carolina State Univ, Raleigh, NC USA
Keywords
shared last level cache; cache partitioning; spatial locality; cache management; high bandwidth memory;
DOI
10.1109/ICPP.2015.24
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In modern multi-core processors, last-level caches (LLCs) are typically shared among multiple cores. Previous works have shown that such sharing is beneficial as different workloads have different needs for cache capacity, and logical partitioning of capacity can improve system performance. However, what is missing in previous works on partitioning shared LLCs is that the heterogeneity in spatial locality among workloads has not been explored. In other words, all the cores use the same block/line size in shared LLCs. In this work, we highlight that exploiting spatial locality enables much more effective cache sharing. The fundamental reason is that for many memory intensive workloads, their cache capacity requirements can be drastically reduced when a large block size is employed, therefore they can effectively donate more capacity to other workloads. To leverage spatial locality for cache partitioning effectively, we first propose a simple yet effective mechanism to measure both spatial and temporal locality at run-time. The locality information is then used to determine both the proper block size and the capacity assigned to each workload. Our experiments show that our Spatial Locality-aware Cache Partitioning (SLCP) significantly outperforms the previous works. We also present several case studies that dissect the effectiveness of SLCP compared to the existing approaches.
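The abstract's central claim, that a larger block size can shrink the capacity a memory-intensive workload needs, can be illustrated with a toy model. The sketch below is not the SLCP mechanism from the paper. It assumes a fully associative LRU cache and a hypothetical streaming trace, and the names `count_misses` and `streaming` are invented for this illustration.

```python
from collections import OrderedDict


def count_misses(addresses, capacity_bytes, block_bytes):
    """Count misses in a fully associative LRU cache (toy model, an assumption)."""
    num_blocks = capacity_bytes // block_bytes
    if num_blocks < 1:
        raise ValueError("capacity must hold at least one block")
    cache = OrderedDict()  # keys are block numbers, ordered from LRU to MRU
    misses = 0
    for addr in addresses:
        block = addr // block_bytes
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


# Hypothetical streaming workload: one pass over 64 KiB in 8-byte steps.
streaming = list(range(0, 64 * 1024, 8))

# 16 KiB of capacity with 64-byte blocks.
small_blocks = count_misses(streaming, capacity_bytes=16 * 1024, block_bytes=64)
# Only 1 KiB of capacity, but with 256-byte blocks.
large_blocks = count_misses(streaming, capacity_bytes=1024, block_bytes=256)

print(small_blocks, large_blocks)
```

In this toy setup the streaming trace misses less often with 1 KiB of 256-byte blocks than with 16 KiB of 64-byte blocks, so the remaining capacity could in principle be given to another workload. Real LLCs are set-associative and the paper measures locality at run-time, so this only conveys the intuition.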
Pages: 150 - 159
Page count: 10
Related papers
50 in total
  • [1] Locality-aware cache random replacement policies
    Benedicte, Pedro
    Hernandez, Carles
    Abella, Jaume
    Cazorla, Francisco J.
    [J]. JOURNAL OF SYSTEMS ARCHITECTURE, 2019, 93 : 48 - 61
  • [2] DELTA: Distributed Locality-Aware Cache Partitioning for Tile-based Chip Multiprocessors
    Holtryd, Nadja
    Manivannan, Madhavan
    Stenstrom, Per
    Pericas, Miquel
    [J]. 2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM IPDPS 2020, 2020, : 578 - 589
  • [3] Reuse locality aware cache partitioning for last-level cache
    Shen, Fanfan
    He, Yanxiang
    Zhang, Jun
    Li, Qingan
    Li, Jianhua
    Xu, Chao
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2019, 74 : 319 - 330
  • [4] Locality-Aware Data Replication in the Last-Level Cache
    Kurian, George
    Devadas, Srinivas
    Khan, Omer
    [J]. 2014 20TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA-20), 2014, : 1 - 12
  • [5] LACS: A Locality-Aware Cost-Sensitive Cache Replacement Algorithm
    Kharbutli, Mazen
    Sheikh, Rami
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2014, 63 (08) : 1975 - 1987
  • [6] Cache Storage Optimization for Locality-Aware Peer-to-Peer Multimedia Distribution
    Di Pascale, Emanuele
    Ruffini, Marco
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2015, : 5565 - 5570
  • [7] A Locality-Aware Write Filter Cache for Energy Reduction of STTRAM-Based L1 Data Cache
    Kong, Joonho
    [J]. JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, 2016, 16 (01) : 80 - 90
  • [8] A Spatial and Temporal Locality-Aware Adaptive Cache Design With Network Optimization for Tiled Many-Core Architectures
    Wang, Mingyu
    Li, Zhaolin
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2017, 25 (09) : 2419 - 2433
  • [9] Locality-aware data replication in the last-level cache for large scale multicores
    Hijaz, Farrukh
    Shi, Qingchuan
    Kurian, George
    Devadas, Srinivas
    Khan, Omer
    [J]. JOURNAL OF SUPERCOMPUTING, 2016, 72 (02) : 718 - 752