Criticality Aware Tiered Cache Hierarchy: A Fundamental Relook at Multi-level Cache Hierarchies

Cited by: 20
Authors
Nori, Anant Vithal [1 ]
Gaur, Jayesh [1 ]
Rai, Siddharth [2 ]
Subramoney, Sreenivas [1 ]
Wang, Hong [1 ]
Affiliations
[1] Intel, Microarchitecture Res Lab, Santa Clara, CA 95054 USA
[2] Indian Inst Technol Kanpur, Kanpur, Uttar Pradesh, India
Keywords
Criticality; Caching; Prefetching
DOI
10.1109/ISCA.2018.00019
CLC number (Chinese Library Classification)
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
On-die caches are a popular way to hide main-memory latency. However, it is difficult to build large caches without substantially increasing their access latency, which in turn hurts performance. To overcome this difficulty, on-die caches are typically organized as a multi-level hierarchy; the three-level cache hierarchy, in particular, has been adopted by modern microprocessors. A three-level hierarchy enables a low average hit latency, since most requests are serviced by the faster inner-level caches. This has motivated recent microprocessors to deploy large level-2 (L2) caches to further reduce the average hit latency. In this paper, we perform a fundamental analysis of the popular three-level cache hierarchy and explain its performance delivery through the lens of program criticality. Our detailed analysis shows that the current trend of increasing L2 cache sizes to reduce average hit latency is, in fact, an inefficient design choice. We instead propose the Criticality Aware Tiered Cache Hierarchy (CATCH), which combines accurate hardware detection of program criticality with a novel set of inter-cache prefetchers to ensure that on-die data accesses on the critical path of execution are served at the latency of the fastest level-1 (L1) cache. The last-level cache (LLC) then serves to reduce slow memory accesses, making the large L2 cache redundant for most applications. The area saved by eliminating the L2 cache can be used to create more efficient processor configurations. Our simulation results show that CATCH outperforms a three-level cache hierarchy with a large 1 MB L2 and an exclusive LLC by an average of 8.4%, and a baseline with a 256 KB L2 and an inclusive LLC by 10.3%. We also show that CATCH provides a powerful framework for exploring broad chip-level area, performance, and power trade-offs in cache hierarchy design. Using CATCH, we evaluate radical architecture directions such as eliminating the L2 altogether and show that such architectures can yield a 4.5% performance gain over the baseline at nearly 30% less area, or improve performance by 7.3% at the same area while reducing energy consumption by 11%.
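The abstract's core argument is that growing the L2 to reduce average hit latency is an inefficient use of area once critical loads can be served at L1 latency. As a rough illustration of the average-latency bookkeeping behind that argument, the short sketch below computes average load latency for a conventional three-level hierarchy and for a CATCH-like two-level hierarchy in which criticality-aware prefetching is assumed to raise the L1 hit fraction. All cycle counts and hit fractions are illustrative assumptions for this sketch, not numbers reported in the paper.

    # Back-of-the-envelope average-latency model for on-die cache hierarchies.
    # All cycle counts and hit fractions are illustrative assumptions,
    # not measurements from the CATCH paper.

    def avg_latency(levels):
        """levels: list of (hit_fraction, latency_cycles); fractions sum to 1."""
        assert abs(sum(f for f, _ in levels) - 1.0) < 1e-9
        return sum(f * lat for f, lat in levels)

    # Conventional three-level hierarchy: L1 / large L2 / LLC / DRAM.
    three_level = [(0.80, 4), (0.12, 14), (0.06, 40), (0.02, 200)]

    # CATCH-like two-level hierarchy: inter-cache prefetching is assumed to
    # move most critical lines into the L1, the LLC filters DRAM accesses,
    # and the L2 is removed.
    catch_like = [(0.90, 4), (0.08, 40), (0.02, 200)]

    print(f"three-level average latency : {avg_latency(three_level):.1f} cycles")
    print(f"CATCH-like average latency  : {avg_latency(catch_like):.1f} cycles")

Note that a simple average understates the paper's point: the claimed benefit comes from serving specifically the accesses on the critical path of execution at L1 latency, which matters more for performance than the average over all accesses.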
Pages: 96-109
Number of pages: 14
Related papers
50 records in total
  • [1] A Data-sharing Aware and Scalable Cache Miss Rates Model for Multi-core Processors with Multi-level Cache Hierarchies
    Wang, Guangmin
    Ge, Jiancong
    Yan, Yunhao
    Ling, Ming
    [J]. 2019 IEEE 25TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2019, : 267 - 274
  • [2] MorphCache: A Reconfigurable Adaptive Multi-level Cache Hierarchy
    Srikantaiah, Shekhar
    Kultursay, Emre
    Zhang, Tao
    Kandemir, Mahmut
    Irwin, Mary Jane
    Xie, Yuan
    [J]. 2011 IEEE 17TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2011, : 231 - 242
  • [3] ASA: An Adaptive Space Allocation algorithm for cache management in multi-level cache hierarchy
    Ou, Li
    Sankar, Karthik
    He, Xubin Ben
    [J]. PROCEEDINGS OF THE THIRTY-EIGHTH SOUTHEASTERN SYMPOSIUM ON SYSTEM THEORY, 2004, : 524 - 528
  • [4] Multi-level cache hierarchy evaluation for programmable media processors
    Fritts, J
    Wolf, W
    [J]. 2000 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS: DESIGN AND IMPLEMENTATION, 2000, : 228 - 237
  • [5] Evaluating Application Vulnerability to Soft Errors in Multi-level Cache Hierarchy
    Ma, Zhe
    Carlson, Trevor
    Heirman, Wim
    Eeckhout, Lieven
    [J]. EURO-PAR 2011: PARALLEL PROCESSING WORKSHOPS, PT II, 2012, 7156 : 272 - 281
  • [6] Reducing cache conflicts by multi-level cache partitioning and array elements mapping
    Chang, CY
    Sheu, JP
    Chen, HC
    [J]. JOURNAL OF SUPERCOMPUTING, 2002, 22 (02): 197 - 219
  • [7] Reducing cache conflicts by multi-level cache partitioning and array elements mapping
    Chang, CY
    Sheu, JP
    Chen, HC
    [J]. SEVENTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, PROCEEDINGS, 2000, : 195 - 202
  • [8] Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping
    Chih-Yung Chang
    Jang-Ping Sheu
    Hsi-Chiuen Chen
    [J]. The Journal of Supercomputing, 2002, 22 : 197 - 219
  • [9] NVMain Extension for Multi-Level Cache Systems
    Khan, Asif Ali
    Hameed, Fazal
    Castrillon, Jeronimo
    [J]. PROCEEDINGS OF THE RAPIDO'18 WORKSHOP, HIPEAC'18 CONFERENCE, 2018,
  • [10] Hybrid Multi-level Cache Management Policy
    Chikhale, Krupal
    Shrawankar, Urmila
    [J]. 2014 FOURTH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT), 2014, : 1119 - 1123