Dense Footprint Cache: Capacity-Efficient Die-Stacked DRAM Last Level Cache

Cited by: 0

Authors:

Shin, Seunghee [1]

Kim, Sihong [2]

Solihin, Yan [1]

Affiliations:

[1] North Carolina State Univ, Dept Elect & Comp Engn, Raleigh, NC 27695 USA

[2] Samsung Elect, Suwon, South Korea

Source:

MEMSYS 2016: PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY SYSTEMS | 2016

Keywords:

Die-stacked DRAM; last-level cache; replacement policy;

DOI:

10.1145/2989081.2989096

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Die-stacked DRAM technology enables a large Last Level Cache (LLC) that provides high-bandwidth data access to the processor. However, it requires a large tag array that may take a significant portion of the on-chip SRAM budget. To reduce this SRAM overhead, systems like Intel Haswell rely on a large block (Mblock) size. One drawback of a large Mblock size is that many bytes of an Mblock are fetched into the cache even though the processor does not need them. A recent technique (Footprint cache) addresses this problem by dividing the Mblock into smaller blocks and bringing into the LLC only the blocks predicted to be needed by the processor. While this alleviates the excessive bandwidth consumption from fetching unneeded blocks, the capacity waste remains: only blocks that are predicted useful are fetched and allocated, and the remaining area of the Mblock is left empty, creating holes. Unfortunately, holes waste a significant amount of capacity that could have been used for useful data, and refresh power is wasted on them as well. In this paper, we propose a new design, Dense Footprint Cache (DFC). Similar to Footprint cache, DFC uses a large Mblock and relies on useful block prediction to reduce memory bandwidth consumption. However, when blocks of an Mblock are fetched, they are placed contiguously in the cache, thereby eliminating holes, increasing capacity and power efficiency, and improving performance. Mblocks in DFC have variable sizes and a cache set has variable associativity, which presents new challenges in designing its management policies (placement, replacement, and update). Through simulation of Big Data applications, we show that DFC reduces LLC miss ratios by about 43% and speeds up applications by 9.5%, while consuming 4.3% less energy on average.
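
The capacity argument in the abstract can be illustrated with a toy model. The sketch below is not the authors' design: the set capacity, the Mblock size, and the function names are all hypothetical, and it ignores tags, prediction, and replacement entirely. It only compares how many Mblocks fit in one cache set when each Mblock reserves a full fixed-size frame (leaving holes) with how many fit when only the predicted blocks are placed contiguously.

```python
# Toy model only: sizes and names are illustrative assumptions, not the paper's design.
BLOCKS_PER_MBLOCK = 8   # assumed number of small blocks per Mblock
SET_CAPACITY = 32       # assumed number of block slots in one cache set


def fixed_frame_resident(footprints):
    """Each Mblock reserves a full frame; slots without predicted blocks are holes.

    Returns (number of resident Mblocks, number of hole slots).
    """
    frames = SET_CAPACITY // BLOCKS_PER_MBLOCK
    resident = footprints[:frames]
    holes = sum(BLOCKS_PER_MBLOCK - n for n in resident)
    return len(resident), holes


def dense_resident(footprints):
    """Each Mblock takes only as many contiguous slots as it has predicted blocks.

    Returns (number of resident Mblocks, number of free slots left in the set).
    """
    used = 0
    count = 0
    for n in footprints:
        if used + n > SET_CAPACITY:
            break
        used += n
        count += 1
    return count, SET_CAPACITY - used


# Hypothetical counts of predicted-useful blocks for Mblocks mapping to one set.
footprints = [3, 5, 2, 4, 6, 1, 7, 4]
print(fixed_frame_resident(footprints))  # (4, 18)
print(dense_resident(footprints))        # (8, 0)
```

With these made-up footprints, fixed frames hold 4 Mblocks and leave 18 slots empty, while contiguous placement holds all 8 in the same 32 slots. The varying number of resident Mblocks per set is also why the abstract describes the associativity as variable.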


Pages: 191-203

Page count: 13
