Criticality Aware Tiered Cache Hierarchy: A Fundamental Relook at Multi-level Cache Hierarchies

Cited by: 20
Authors
Nori, Anant Vithal [1 ]
Gaur, Jayesh [1 ]
Rai, Siddharth [2 ]
Subramoney, Sreenivas [1 ]
Wang, Hong [1 ]
Affiliations
[1] Intel, Microarchitecture Res Lab, Santa Clara, CA 95054 USA
[2] Indian Inst Technol Kanpur, Kanpur, Uttar Pradesh, India
Keywords
Criticality; Caching; Prefetching
DOI
10.1109/ISCA.2018.00019
CLC number (Chinese Library Classification)
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
On-die caches are a popular way to hide main-memory latency. However, it is difficult to build large caches without substantially increasing their access latency, which in turn hurts performance. To overcome this difficulty, on-die caches are typically organized as a multi-level hierarchy; the three-level cache hierarchy, in particular, has been adopted by modern microprocessors. A three-level hierarchy enables a low average hit latency, since most requests are serviced by the faster inner-level caches. This has motivated recent microprocessors to deploy large level-2 (L2) caches to further reduce the average hit latency. In this paper, we perform a fundamental analysis of the popular three-level cache hierarchy and explain its performance delivery through the lens of program criticality. Our detailed analysis shows that the current trend of increasing L2 cache sizes to reduce average hit latency is, in fact, an inefficient design choice. We instead propose the Criticality Aware Tiered Cache Hierarchy (CATCH), which combines accurate hardware detection of program criticality with a novel set of inter-cache prefetchers to ensure that on-die data accesses on the critical path of execution are served at the latency of the fastest level-1 (L1) cache. The last-level cache (LLC) then serves to reduce slow memory accesses, making the large L2 cache redundant for most applications. The area saved by eliminating the L2 cache can be used to create more efficient processor configurations. Our simulation results show that CATCH outperforms a three-level cache hierarchy with a large 1 MB L2 and an exclusive LLC by an average of 8.4%, and a baseline with a 256 KB L2 and an inclusive LLC by 10.3%. We also show that CATCH provides a powerful framework for exploring broad chip-level area, performance, and power trade-offs in cache hierarchy design. Using CATCH, we evaluate radical architecture directions such as eliminating the L2 altogether and show that such architectures can yield a 4.5% performance gain over the baseline at nearly 30% less area, or improve performance by 7.3% at the same area while reducing energy consumption by 11%.
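The abstract's core argument is that growing the L2 to reduce average hit latency is an inefficient use of area once critical loads can be served at L1 latency. As a rough illustration of the average-latency bookkeeping behind that argument, the short sketch below computes average load latency for a conventional three-level hierarchy and for a CATCH-like two-level hierarchy in which criticality-aware prefetching is assumed to raise the L1 hit fraction. All cycle counts and hit fractions are illustrative assumptions for this sketch, not numbers reported in the paper.

    # Back-of-the-envelope average-latency model for on-die cache hierarchies.
    # All cycle counts and hit fractions are illustrative assumptions,
    # not measurements from the CATCH paper.

    def avg_latency(levels):
        """levels: list of (hit_fraction, latency_cycles); fractions sum to 1."""
        assert abs(sum(f for f, _ in levels) - 1.0) < 1e-9
        return sum(f * lat for f, lat in levels)

    # Conventional three-level hierarchy: L1 / large L2 / LLC / DRAM.
    three_level = [(0.80, 4), (0.12, 14), (0.06, 40), (0.02, 200)]

    # CATCH-like two-level hierarchy: inter-cache prefetching is assumed to
    # move most critical lines into the L1, the LLC filters DRAM accesses,
    # and the L2 is removed.
    catch_like = [(0.90, 4), (0.08, 40), (0.02, 200)]

    print(f"three-level average latency : {avg_latency(three_level):.1f} cycles")
    print(f"CATCH-like average latency  : {avg_latency(catch_like):.1f} cycles")

Note that a simple average understates the paper's point: the claimed benefit comes from serving specifically the accesses on the critical path of execution at L1 latency, which matters more for performance than the average over all accesses.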
Pages: 96-109
Number of pages: 14
Related papers
50 records in total
  • [1] A Data-sharing Aware and Scalable Cache Miss Rates Model for Multi-core Processors with Multi-level Cache Hierarchies
    Wang, Guangmin
    Ge, Jiancong
    Yan, Yunhao
    Ling, Ming
    [J]. 2019 IEEE 25TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2019, : 267 - 274
  • [2] MorphCache: A Reconfigurable Adaptive Multi-level Cache Hierarchy
    Srikantaiah, Shekhar
    Kultursay, Emre
    Zhang, Tao
    Kandemir, Mahmut
    Irwin, Mary Jane
    Xie, Yuan
    [J]. 2011 IEEE 17TH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2011, : 231 - 242
  • [3] ASA: An Adaptive Space Allocation algorithm for cache management in multi-level cache hierarchy
    Ou, Li
    Sankar, Karthik
    He, Xubin Ben
    [J]. PROCEEDINGS OF THE THIRTY-EIGHTH SOUTHEASTERN SYMPOSIUM ON SYSTEM THEORY, 2004, : 524 - 528
  • [4] Multi-level cache hierarchy evaluation for programmable media processors
    Fritts, J
    Wolf, W
    [J]. 2000 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS: DESIGN AND IMPLEMENTATION, 2000, : 228 - 237
  • [5] Evaluating Application Vulnerability to Soft Errors in Multi-level Cache Hierarchy
    Ma, Zhe
    Carlson, Trevor
    Heirman, Wim
    Eeckhout, Lieven
    [J]. EURO-PAR 2011: PARALLEL PROCESSING WORKSHOPS, PT II, 2012, 7156 : 272 - 281
  • [6] Reducing cache conflicts by multi-level cache partitioning and array elements mapping
    Chang, CY
    Sheu, JP
    Chen, HC
    [J]. JOURNAL OF SUPERCOMPUTING, 2002, 22 (02): 197 - 219
  • [7] Reducing cache conflicts by multi-level cache partitioning and array elements mapping
    Chang, CY
    Sheu, JP
    Chen, HC
    [J]. SEVENTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, PROCEEDINGS, 2000, : 195 - 202
  • [8] Reducing Cache Conflicts by Multi-Level Cache Partitioning and Array Elements Mapping
    Chih-Yung Chang
    Jang-Ping Sheu
    Hsi-Chiuen Chen
    [J]. The Journal of Supercomputing, 2002, 22 : 197 - 219
  • [9] NVMain Extension for Multi-Level Cache Systems
    Khan, Asif Ali
    Hameed, Fazal
    Castrillon, Jeronimo
    [J]. PROCEEDINGS OF THE RAPIDO'18 WORKSHOP, HIPEAC'18 CONFERENCE, 2018,
  • [10] Hybrid Multi-level Cache Management Policy
    Chikhale, Krupal
    Shrawankar, Urmila
    [J]. 2014 FOURTH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT), 2014, : 1119 - 1123