High Performance and Predictable Shared Last-level Cache for Safety-Critical Systems

Authors
Wu, Zhuanhao [1]
Kaushik, Anirudh [2]
Patel, Hiren [3]
Affiliations
[1] Univ Waterloo, Waterloo, ON, Canada
[2] Intel Corp, Toronto, ON, Canada
[3] Univ Waterloo, Elect & Comp Engn, Waterloo, ON, Canada
Keywords
Last-level cache; inclusive cache; safety-critical systems; worst-case latency analysis; back invalidation
DOI
10.1145/3687308
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We propose ZeroCost-LLC (ZCLLC), a novel shared inclusive last-level cache (LLC) design for timing-predictable multi-core platforms that offers lower worst-case latency (WCL) than a traditional shared inclusive LLC design. ZCLLC achieves low WCL by eliminating certain memory operations, namely cache line invalidations across the cache hierarchy, that arise when a core's memory request misses in the cache hierarchy and the LLC has no vacant entry to accommodate the data fetched for that request. In addition to low WCL, ZCLLC offers performance benefits in the form of additional caching capacity and, unlike state-of-the-art approaches, does not impose any constraints on its usage across multiple cores. In this work, we describe the impact of LLC cache line invalidations on the WCL and systematically build solutions that eliminate these invalidations, resulting in ZCLLC. We also present ZCLLC-OPT, an optimized variant of ZCLLC that offers lower WCL and improved average-case performance over ZCLLC. To this end, we apply optimizations to the shared bus arbitration mechanism and extend the micro-architecture of ZCLLC to allow overlapping memory requests to the main memory. Our analysis reveals that the analytical WCL of a memory request under ZCLLC-OPT is 87.0%, 93.8%, and 97.1% lower than that under state-of-the-art LLC partition sharing techniques for 2, 4, and 8 cores, respectively. ZCLLC-OPT shows average-case performance speedups of 1.89x, 3.36x, and 6.24x over the state-of-the-art LLC partition sharing techniques for 2, 4, and 8 cores, respectively. Compared with the original ZCLLC without any optimizations (ZCLLC-NORMAL), the analytical WCL of ZCLLC-OPT is 76.5%, 82.6%, and 86.2% lower for 2, 4, and 8 cores, respectively.
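The back invalidations that the abstract identifies as a source of high WCL can be illustrated with a toy model of a conventional inclusive hierarchy. The sketch below is not ZCLLC and makes no claim about the paper's design; it only models the baseline behavior in which evicting an LLC line forces the removal of that line from every private cache. The class name, the cache sizes, and the LRU replacement policy are assumptions chosen for illustration.

```python
from collections import OrderedDict


class InclusiveHierarchy:
    """Toy model: per-core private caches under one shared inclusive LLC.

    Hypothetical illustration only; sizes are in cache lines and both
    levels use plain LRU replacement.
    """

    def __init__(self, num_cores, private_lines, llc_lines):
        self.private = [OrderedDict() for _ in range(num_cores)]
        self.llc = OrderedDict()
        self.private_lines = private_lines
        self.llc_lines = llc_lines
        self.back_invalidations = 0

    def access(self, core, addr):
        """Access line `addr` from `core`; return 'hit', 'llc-hit', or 'miss'."""
        cache = self.private[core]
        if addr in cache:
            # Private hits are not visible to the LLC, so its LRU order
            # is left unchanged here.
            cache.move_to_end(addr)
            return "hit"
        if addr in self.llc:
            self.llc.move_to_end(addr)
            result = "llc-hit"
        else:
            if len(self.llc) >= self.llc_lines:
                self._evict_llc_victim()
            self.llc[addr] = True
            result = "miss"
        self._fill_private(core, addr)
        return result

    def _fill_private(self, core, addr):
        cache = self.private[core]
        if len(cache) >= self.private_lines:
            # Private eviction is silent: the line stays in the LLC.
            cache.popitem(last=False)
        cache[addr] = True

    def _evict_llc_victim(self):
        victim, _ = self.llc.popitem(last=False)
        # Inclusion: a line absent from the LLC may not live in any
        # private cache, so every private copy is invalidated.
        for cache in self.private:
            if victim in cache:
                del cache[victim]
                self.back_invalidations += 1


if __name__ == "__main__":
    h = InclusiveHierarchy(num_cores=2, private_lines=2, llc_lines=2)
    for core, addr in [(0, "A"), (0, "A"), (1, "B"), (1, "C"), (0, "A")]:
        print(core, addr, h.access(core, addr))
    print("back invalidations:", h.back_invalidations)
```

In this model, core 1's misses evict a line that core 0 is still using, so core 0's next access to it misses even though its private cache had room. This is the kind of cross-core interference that a WCL analysis of an inclusive LLC has to account for.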
Pages: 30