Shared Last-level Cache Management for GPGPUs with Hybrid Main Memory

Cited by: 0

Authors:

Wang, Guan [1]

Cai, Xiaojun [1]

Ju, Lei [1]

Zang, Chuanqi [1]

Zhao, Mengying [1]

Jia, Zhiping [1]

Affiliation:

[1] Shandong University, School of Computer Science and Technology, Jinan, Shandong, People's Republic of China

Source:

PROCEEDINGS OF THE 2017 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE) | 2017

Funding:

National Natural Science Foundation of China;

Keywords:

HIGH-PERFORMANCE;

DOI:

Not available

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Memory-intensive workloads are becoming increasingly popular on general-purpose graphics processing units (GPGPUs) and impose great challenges on GPGPU memory subsystem design. Meanwhile, with the recent development of nonvolatile memory (NVM) technologies, hybrid memory combining DRAM and NVM achieves high performance, low power, and high density simultaneously, making it a promising main memory design for GPGPUs. In this work, we explore shared last-level cache management for GPGPUs with consideration of the underlying hybrid main memory. To improve overall memory subsystem performance, we exploit both the asymmetric read/write latency of the hybrid main memory architecture and the memory coalescing feature of GPGPUs. In particular, to reduce the average cost of L2 cache misses, we prioritize cache blocks from DRAM or NVM based on the observation that operations to the NVM part of main memory have a large impact on system performance. Furthermore, the cache management scheme integrates GPU memory coalescing and cache bypassing techniques to improve the overall cache hit ratio. Experimental results show that, in the context of a hybrid main memory system, our proposed L2 cache management policy improves performance over the traditional LRU policy and a state-of-the-art GPU cache strategy, EABP [20], by up to 27.76% and 14%, respectively.
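The idea of prioritizing cache blocks by their backing memory can be illustrated with a small sketch. This is not the policy from the paper, whose details the abstract does not give. It is a minimal illustration that assumes a single cache set, a hypothetical `CostAwareSet` class, and a hypothetical `window` parameter that limits how far from the LRU position a victim may be taken. The assumption is that refetching an NVM-backed block costs more than refetching a DRAM-backed one, so DRAM-backed blocks are evicted first when possible.

```python
from collections import OrderedDict


class CostAwareSet:
    """One cache set: LRU ordering with a bias against evicting NVM-backed blocks.

    Hypothetical illustration only; names and parameters are assumptions.
    """

    def __init__(self, ways, window=2):
        self.ways = ways              # associativity of the set
        self.window = window          # number of LRU-most blocks considered as victims
        self.blocks = OrderedDict()   # tag -> "DRAM" or "NVM"; first entry is the LRU block

    def access(self, tag, source):
        """Return True on a hit; on a miss, insert the block and return False."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)   # promote to most recently used
            return True
        if len(self.blocks) >= self.ways:
            self._evict()
        self.blocks[tag] = source
        return False

    def _evict(self):
        """Evict the least recent DRAM-backed block in the window, else the plain LRU block."""
        candidates = list(self.blocks.items())[: self.window]
        for tag, source in candidates:
            if source == "DRAM":           # assumed cheaper to refetch than NVM
                del self.blocks[tag]
                return tag
        tag, _ = self.blocks.popitem(last=False)   # fall back to plain LRU
        return tag


s = CostAwareSet(ways=2, window=2)
s.access(0xA, "NVM")    # miss
s.access(0xB, "DRAM")   # miss
s.access(0xC, "DRAM")   # miss; evicts 0xB although 0xA is the LRU block
print(s.access(0xA, "NVM"))
```

With `window=1` the sketch degenerates to plain LRU, which makes the effect of the bias easy to compare against the baseline named in the abstract.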


Pages: 25-30

Page count: 6

