Decoupled Fused Cache: Fusing a Decoupled LLC with a DRAM Cache

Cited by: 10

Authors:

Vasilakis, Evangelos [1]

Papaefstathiou, Vassilis [2]

Trancoso, Pedro [1]

Sourdis, Ioannis [1]

Affiliations:

[1] Chalmers Univ Technol, CSE Dept, Rannvagen 6, Gothenburg, Sweden

[2] Fdn Res & Technol Hellas FORTH, 100 Nikolaou Plastira Str, Iraklion, Greece

Source:

ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION | 2019 / Vol. 15 / No. 04

Funding:

European Research Council; European Union Horizon 2020;

Keywords:

Caches; 3D stacking; DRAM; processor; memory;

DOI:

10.1145/3293447

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

DRAM caches have shown excellent potential in capturing the spatial and temporal data locality of applications, capitalizing on advances in 3D-stacking technology; however, they are still far from their ideal performance. Besides the unavoidable DRAM access to fetch the requested data, tag access is in the critical path, adding significant latency and energy costs. Existing approaches are not able to remove these overheads and in some cases limit DRAM cache design options. For instance, caching DRAM cache tags adds constant latency to every access; accessing the DRAM cache using the TLB calls for OS support and DRAM cachelines as large as a page; reusing the last-level cache (LLC) tags to access the DRAM cache limits LLC performance as it requires indexing the LLC using higher-order address bits. In this article, we introduce Decoupled Fused Cache, a DRAM cache design that alleviates the cost of tag accesses by fusing DRAM cache tags with the tags of the on-chip LLC without affecting LLC performance. In essence, the Decoupled Fused Cache relies in most cases on the LLC tag access to retrieve the required information for accessing the DRAM cache while avoiding additional overheads. Compared to current DRAM cache designs of the same cacheline size, Decoupled Fused Cache improves system performance by 6% on average and by 16% to 18% for large cacheline sizes. Finally, Decoupled Fused Cache reduces DRAM cache traffic by 18% and DRAM cache energy consumption by 7%.


Pages: 23

