Dense Footprint Cache: Capacity-Efficient Die-Stacked DRAM Last Level Cache

Cited by: 0
Authors
Shin, Seunghee [1]
Kim, Sihong [2]
Solihin, Yan [1]
Affiliations
[1] North Carolina State Univ, Dept Elect & Comp Engn, Raleigh, NC 27695 USA
[2] Samsung Elect, Suwon, South Korea
Keywords
Die-stacked DRAM; last-level cache; replacement policy
DOI
10.1145/2989081.2989096
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Die-stacked DRAM technology enables a large Last Level Cache (LLC) that provides high-bandwidth data access to the processor. However, it requires a large tag array that may take a significant portion of the on-chip SRAM budget. To reduce this SRAM overhead, systems like Intel Haswell rely on a large block (Mblock) size. One drawback of a large Mblock size is that many bytes of an Mblock are fetched into the cache even though the processor does not need them. A recent technique (Footprint cache) addresses this problem by dividing the Mblock into smaller blocks and bringing into the LLC only the blocks predicted to be needed by the processor. While this helps alleviate the excessive bandwidth consumption from fetching unneeded blocks, the capacity waste remains: only blocks that are predicted useful are fetched and allocated, and the remaining area of the Mblock is left empty, creating holes. Unfortunately, holes occupy significant capacity that could have been used for useful data, and refresh power is wasted on them. In this paper, we propose a new design, Dense Footprint Cache (DFC). Similar to Footprint cache, DFC uses a large Mblock and relies on useful block prediction to reduce memory bandwidth consumption. However, when blocks of an Mblock are fetched, the blocks are placed contiguously in the cache, thereby eliminating holes, increasing capacity and power efficiency, and increasing performance. Because Mblocks in DFC have variable sizes and a cache set has variable associativity, DFC presents new challenges in designing its management policies (placement, replacement, and update). Through simulation of Big Data applications, we show that DFC reduces LLC miss ratios by about 43% and speeds up applications by 9.5%, while consuming 4.3% less energy on average.
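The capacity argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: the Mblock size of 32 blocks, the footprint bitmaps, the set capacity, and all function names are assumptions chosen for illustration, and the model ignores tags, replacement, and update policies. It only counts how many Mblocks stay resident in one cache set when each Mblock reserves a full frame (leaving holes) versus when its predicted blocks are packed contiguously.

```python
from typing import List

# Assumption: 32 blocks per Mblock (for example, a 2 KB Mblock of 64 B blocks).
BLOCKS_PER_MBLOCK = 32


def footprint_size(bitmap: int) -> int:
    """Number of blocks predicted useful, with one bit per block."""
    return bin(bitmap).count("1")


def resident_with_holes(footprints: List[int], set_capacity_blocks: int) -> int:
    """Hole-leaving layout: every Mblock reserves a full frame,
    so unfetched blocks still consume capacity."""
    frames = set_capacity_blocks // BLOCKS_PER_MBLOCK
    return min(len(footprints), frames)


def resident_dense(footprints: List[int], set_capacity_blocks: int) -> int:
    """Dense layout: each Mblock occupies only its predicted blocks,
    packed contiguously in arrival order until the set is full."""
    used = 0
    resident = 0
    for bitmap in footprints:
        size = footprint_size(bitmap)
        if used + size > set_capacity_blocks:
            break
        used += size
        resident += 1
    return resident


# Hypothetical footprints of 4, 8, 16, 2, 32, and 1 predicted blocks.
footprints = [0b1111, 0xFF, 0xFFFF, 0b11, 0xFFFFFFFF, 0b1]
capacity = 4 * BLOCKS_PER_MBLOCK  # a set that holds four full Mblock frames

print(resident_with_holes(footprints, capacity))  # 4
print(resident_dense(footprints, capacity))       # 6
```

In this made-up example the dense layout keeps all six Mblocks resident in 63 of the 128 available blocks, while the hole-leaving layout holds only four. Because the number of resident Mblocks then depends on their footprints, associativity becomes variable, which is the source of the management-policy challenges the abstract mentions.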
Pages: 191-203
Page count: 13
Related papers
42 in total
  • [1] Unison Cache: A Scalable and Effective Die-Stacked DRAM Cache
    Jevdjic, Djordje
    Loh, Gabriel H.
    Kaynak, Cansu
    Falsafi, Babak
    [J]. 2014 47TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2014, : 25 - 37
  • [2] Tidy Cache: Improving Data Placement in Die-stacked DRAM Caches
    Armejach, Adria
    Cristal, Adrian
    Unsal, Osman S.
    [J]. 2015 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 2015, : 65 - 73
  • [3] Efficient RAS Support for Die-stacked DRAM
    Jeon, Hyeran
    Loh, Gabriel H.
    Annavaram, Murali
    [J]. 2014 IEEE INTERNATIONAL TEST CONFERENCE (ITC), 2014,
  • [4] Efficient STT-RAM Last-Level-Cache Architecture to Replace DRAM Cache
    Hameed, Fazal
    Menard, Christian
    Castrillon, Jeronimo
    [J]. MEMSYS 2017: PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY SYSTEMS, 2017, : 141 - 151
  • [5] Filter cache: filtering useless cache blocks for a small but efficient shared last-level cache
    Bae, Han Jun
    Choi, Lynn
    [J]. JOURNAL OF SUPERCOMPUTING, 2020, 76 (10): : 7521 - 7544
  • [6] Filter cache: filtering useless cache blocks for a small but efficient shared last-level cache
    Han Jun Bae
    Lynn Choi
    [J]. The Journal of Supercomputing, 2020, 76 : 7521 - 7544
  • [7] Packet Processing Architecture Using Last-Level-Cache Slices and Interleaved 3D-Stacked DRAM
    Korikawa, Tomohiro
    Kawabata, Akio
    He, Fujun
    Oki, Eiji
    [J]. IEEE ACCESS, 2020, 8 : 59290 - 59304
  • [8] COORDINATING DRAM AND LAST-LEVEL-CACHE POLICIES WITH THE VIRTUAL WRITE QUEUE
    Stuecheli, Jeffrey
    Kaseridis, Dimitris
    John, Lizy K.
    Daly, David
    Hunter, Hillery C.
    [J]. IEEE MICRO, 2011, 31 (01) : 90 - 98
  • [9] The Virtual Write Queue: Coordinating DRAM and Last-Level Cache Policies
    Stuecheli, Jeffrey
    Kaseridis, Dimitris
    Daly, David
    Hunter, Hillery C.
    John, Lizy K.
    [J]. ISCA 2010: THE 37TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, 2010, : 72 - 82
  • [10] Dynamically Reconfigurable Hybrid Cache: An Energy-Efficient Last-Level Cache Design
    Chen, Yu-Ting
    Cong, Jason
    Huang, Hui
    Liu, Bin
    Liu, Chunyue
    Potkonjak, Miodrag
    Reinman, Glenn
    [J]. DESIGN, AUTOMATION & TEST IN EUROPE (DATE 2012), 2012, : 45 - 50