BP-NUCA: CACHE PRESSURE-AWARE MIGRATION FOR HIGH-PERFORMANCE CACHING IN CMPS

Cited: 0
|
Authors
Jia, Xiaomin [1 ]
Jiang, Jiang [1 ]
Wang, Yongwen [1 ]
Qi, Shubo [1 ]
Zhao, Tianlei [1 ]
Fu, Guitao [1 ]
Zhang, Minxuan [1 ]
Institutions
[1] Natl Univ Def Technol, Sch Comp, Dept Microelect, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Chip multi-processors (CMPs); last-level cache (LLC); block migration; non-uniform cache architecture (NUCA);
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the momentum behind Chip Multi-Processors (CMPs) continues to grow, Last-Level Cache (LLC) management becomes a crucial issue for CMPs because off-chip accesses often incur a large latency. A private cache design is distinguished by smaller local access latency, good performance isolation and easy scalability, and is therefore becoming an attractive design alternative for the LLC of CMPs. This paper proposes Balanced Private Non-Uniform Cache Architecture (BP-NUCA), a new LLC architecture that starts from a private cache design for smaller local access latency and good performance isolation, and then introduces a low-cost mechanism to dynamically migrate private blocks among the peer private caches of the LLC to improve overall space utilization. BP-NUCA achieves this by measuring the access pressure level that each cache set experiences at runtime and then using this information to guide block migration among the different private caches of the LLC. A heavily accessed set, namely a set with a high access pressure level, is allowed to migrate its evicted blocks to peer private caches, replacing blocks of same-index sets that have a low access pressure level. By migrating blocks from heavily accessed cache sets to less accessed cache sets, BP-NUCA effectively balances the space utilization of the LLC among different cores. Experimental results using a full-system CMP simulator show that BP-NUCA improves overall throughput by as much as 20.3%, 12.4%, 14.5% and 18.0% (on average 7.7%, 4.4%, 4.0% and 6.1%) over a private cache, a shared cache, the shared cache management scheme UCP and the private cache organization CC, respectively, on a 4-core CMP running SPEC CPU2006 benchmarks.
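The abstract describes the migration policy only in prose. As a rough illustration of the idea, the following Python sketch models per-set saturating pressure counters and the spill of a victim block from a high-pressure set into the same-index, low-pressure set of a peer private cache. All names, structures and parameters here (PrivateLLC, migrate, HI_PRESSURE, the toy cache geometry and thresholds) are assumptions made for illustration only; they are not taken from the BP-NUCA paper.

    # Minimal, illustrative sketch of pressure-aware block migration.
    # All sizes, thresholds and names are hypothetical, NOT from the paper.
    from collections import OrderedDict

    NUM_CORES   = 4      # peer private LLC slices (one per core)
    NUM_SETS    = 8      # sets per private LLC (toy size)
    WAYS        = 4      # associativity
    HI_PRESSURE = 6      # counter value at which a set counts as "heavily accessed"
    LO_PRESSURE = 2      # counter value below which a set may receive migrated blocks
    MAX_COUNT   = 7      # 3-bit saturating counter

    class PrivateLLC:
        """One core's private LLC slice (LRU replacement within each set)."""
        def __init__(self):
            # Each set is an LRU-ordered map: tag -> True.
            self.sets = [OrderedDict() for _ in range(NUM_SETS)]
            # Per-set saturating pressure counters, bumped on every access.
            self.pressure = [0] * NUM_SETS

        def access(self, addr):
            idx, tag = addr % NUM_SETS, addr // NUM_SETS
            self.pressure[idx] = min(self.pressure[idx] + 1, MAX_COUNT)
            s = self.sets[idx]
            if tag in s:                      # hit: refresh LRU position
                s.move_to_end(tag)
                return True, None
            s[tag] = True                     # miss: insert, maybe evict the LRU victim
            victim = s.popitem(last=False)[0] if len(s) > WAYS else None
            return False, victim

    def migrate(evicting_core, set_idx, victim_tag, llcs):
        """On eviction from a high-pressure set, try to place the victim into a
        same-index, low-pressure set of a peer private cache."""
        if llcs[evicting_core].pressure[set_idx] < HI_PRESSURE:
            return False                      # set not under pressure: just drop the victim
        for core, llc in enumerate(llcs):
            if core == evicting_core:
                continue
            if llc.pressure[set_idx] <= LO_PRESSURE:
                peer = llc.sets[set_idx]
                if len(peer) >= WAYS:         # make room by evicting the peer's LRU block
                    peer.popitem(last=False)
                peer[victim_tag] = True       # victim now lives in the peer slice
                return True
        return False

    if __name__ == "__main__":
        llcs = [PrivateLLC() for _ in range(NUM_CORES)]
        # Core 0 hammers set 3 while the other cores stay idle (low pressure),
        # so core 0's evicted blocks spill into the peers' set 3.
        for addr in range(3, 3 + 16 * NUM_SETS, NUM_SETS):
            hit, victim = llcs[0].access(addr)
            if victim is not None:
                migrated = migrate(0, 3, victim, llcs)
                print(f"evicted tag {victim}: migrated={migrated}")

Running the script, core 0's early victims are dropped while its set is still below the pressure threshold; once the set becomes heavily accessed, victims are placed into idle peers, which is the space-balancing behaviour the abstract describes.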
Pages: 1037-1060
Page count: 24
Related Papers
14 records in total
  • [1] Cache pressure-aware caching scheme for content-centric networking
    Luo, Xi
    An, Ying
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2019, 27 (02) : 795 - 806
  • [2] Adaptive Spill-Receive for Robust High-Performance Caching in CMPs
    Qureshi, Moinuddin K.
    [J]. HPCA-15 2009: FIFTEENTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 2009, : 45 - 54
  • [3] PACMan: Prefetch-Aware Cache Management for High Performance Caching
    Wu, Carole-Jean
    Jaleel, Aamer
    Martonosi, Margaret
    Steely, Simon C., Jr.
    Emer, Joel
    [J]. PROCEEDINGS OF THE 2011 44TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO 44), 2011, : 442 - 453
  • [4] Cooperative Partitioning: Energy-Efficient Cache Partitioning for High-Performance CMPs
    Sundararajan, Karthik T.
    Porpodas, Vasileios
    Jones, Timothy M.
    Topham, Nigel P.
    Franke, Bjoern
    [J]. 2012 IEEE 18TH INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2012, : 311 - 322
  • [5] Flash-Aware High-Performance and Endurable Cache
    Xia, Qianbin
    Xiao, Weijun
    [J]. 2015 IEEE 23RD INTERNATIONAL SYMPOSIUM ON MODELING, ANALYSIS, AND SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS 2015), 2015, : 47 - 50
  • [6] A Reusability-Aware Cache Memory Sharing Technique for High-Performance Low-Power CMPs with Private L2 Caches
    Youn, Sungjune
    Kim, Hyunhee
    Kim, Jihong
    [J]. ISLPED'07: PROCEEDINGS OF THE 2007 INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN, 2007, : 56 - 61
  • [7] LP-NUCA: Networks-in-Cache for High-Performance Low-Power Embedded Processors
    Suarez Gracia, Dario
    Dimitrakopoulos, Giorgos
    Monreal Arnal, Teresa
    Katevenis, Manolis G. H.
    Vinals Yufera, Victor
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2012, 20 (08) : 1510 - 1523
  • [8] High-Performance and Endurable Cache Management for Flash-Based Read Caching
    Xia, Qianbin
    Xiao, Weijun
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2016, 27 (12) : 3518 - 3531
  • [9] GL-Cache: Group-level learning for efficient and high-performance caching
    Yang, Juncheng
    Mao, Ziming
    Yue, Yao
    Rashmi, K. V.
    [J]. PROCEEDINGS OF THE 21ST USENIX CONFERENCE ON FILE AND STORAGE TECHNOLOGIES, FAST 2023, 2023, : 115 - 133
  • [10] LAC: A Workload Intensity-Aware Caching Scheme for High-Performance SSDs
    Sun, Hui
    Tong, Haoqiang
    Yue, Yinliang
    Qin, Xiao
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2024, 73 (07) : 1738 - 1752