High Performance and Predictable Shared Last-level Cache for Safety-Critical Systems

Authors
Wu, Zhuanhao [1]
Kaushik, Anirudh [2]
Patel, Hiren [3]
Affiliations
[1] Univ Waterloo, Waterloo, ON, Canada
[2] Intel Corp, Toronto, ON, Canada
[3] Univ Waterloo, Elect & Comp Engn, Waterloo, ON, Canada
Keywords
Last-level cache; inclusive cache; safety-critical systems; worst-case latency analysis; back invalidation
DOI
10.1145/3687308
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We propose ZeroCost-LLC (ZCLLC), a novel shared inclusive last-level cache (LLC) design for timing-predictable multi-core platforms that offers lower worst-case latency (WCL) than a traditional shared inclusive LLC design. ZCLLC achieves low WCL by eliminating certain cache line invalidations across the cache hierarchy. These invalidations arise when a core's memory request misses in the cache hierarchy and the LLC has no vacant entry to accommodate the fetched data. In addition to low WCL, ZCLLC offers performance benefits in the form of additional caching capacity and, unlike state-of-the-art approaches, does not impose any constraints on its usage across multiple cores. In this work, we describe the impact of LLC cache line invalidations on the WCL and systematically build solutions that eliminate these invalidations, resulting in ZCLLC. We also present ZCLLC-OPT, an optimized variant of ZCLLC that offers lower WCL and improved average-case performance over ZCLLC. We apply optimizations to the shared bus arbitration mechanism and extend the micro-architecture of ZCLLC to allow for overlapping memory requests to the main memory. Our analysis reveals that the analytical WCL of a memory request under ZCLLC-OPT is 87.0%, 93.8%, and 97.1% lower than that under state-of-the-art LLC partition sharing techniques for 2, 4, and 8 cores, respectively. ZCLLC-OPT shows average-case performance speedups of 1.89x, 3.36x, and 6.24x compared with the state-of-the-art LLC partition sharing techniques for 2, 4, and 8 cores, respectively. Compared with the original ZCLLC without any optimizations (ZCLLC-NORMAL), ZCLLC-OPT shows analytical WCLs that are 76.5%, 82.6%, and 86.2% lower for 2, 4, and 8 cores, respectively.
Pages: 30
Related papers
50 in total
  • [21] Cost aware cache replacement policy in shared last-level cache for hybrid memory based fog computing
    Jia, Gangyong
    Han, Guangjie
    Wang, Hao
    Wang, Feng
    ENTERPRISE INFORMATION SYSTEMS, 2018, 12 (04) : 435 - 451
  • [22] A Web Cache Replacement Strategy for Safety-Critical Systems
    Du, Jianhai
    Gao, Shiwei
    Lv, Jianghua
    Li, Qianqian
    Ma, Shilong
    TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2018, 25 (03) : 820 - 830
  • [23] Enhancing last-level cache performance by block bypassing and early miss determination
    Dybdahl, Haakon
    Stenstrom, Per
    ADVANCES IN COMPUTER SYSTEMS ARCHITECTURE, PROCEEDINGS, 2006, 4186 : 52 - 66
  • [24] Premier: A Concurrency-Aware Pseudo-Partitioning Framework for Shared Last-Level Cache
    Lu, Xiaoyang
    Wang, Rujia
    Sun, Xian-He
    2021 IEEE 39TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2021), 2021, : 391 - 394
  • [25] Shared Last-Level TLBs for Chip Multiprocessors
    Bhattacharjee, Abhishek
    Lustig, Daniel
    Martonosi, Margaret
    2011 IEEE 17TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2011, : 62 - 73
  • [26] RExCache: Rapid Exploration of Unified Last-level Cache
    Shwe, Su Myat Min
    Javaid, Haris
    Parameswaran, Sri
    2013 18TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2013, : 582 - 587
  • [27] A Pragmatic Delineation on Cache Bypass Algorithm in Last-Level Cache (LLC)
    Dash, Banchhanidhi
    Swain, Debabala
    Swain, Debabrata
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, CIDM, VOL 2, 2016, 411 : 37 - 45
  • [28] Write-back Aware Shared Last-level Cache Management for Hybrid Main Memory
    Zhang, Deshan
    Ju, Lei
    Zhao, Mengying
    Gao, Xiang
    Jia, Zhiping
    2016 ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2016,
  • [29] Runtime-Driven Shared Last-Level Cache Management for Task-Parallel Programs
    Pan, Abhisek
    Pai, Vijay S.
    PROCEEDINGS OF SC15: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2015,
  • [30] ABS: A Low-Cost Adaptive Controller for Prefetching in a Banked Shared Last-Level Cache
    Albericio, Jorge
    Gran, Ruben
    Ibanez, Pablo
    Vinals, Victor
    Maria Llaberia, Jose
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2012, 8 (04)