Shared Last-Level Cache Management and Memory Scheduling for GPGPUs with Hybrid Main Memory

Cited by: 0

Authors:

Wang, Guan [1]

Zang, Chuanqi [2]

Ju, Lei [2]

Zhao, Mengying [1]

Cai, Xiaojun [1]

Jia, Zhiping [1]

Affiliations:

[1] Shandong Univ, Sch Comp Sci & Technol, Qingdao, Peoples R China

[2] Shandong Univ, Sch Software, Jinan, Shandong, Peoples R China

Source:

ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS | 2018 / Volume 17 / Issue 04

Funding:

National Key Research and Development Program;

Keywords:

NVM; GPGPU; hybrid memory; cache management; cache bypassing; memory scheduling; HIGH-PERFORMANCE; ALLOCATION; PCM;

DOI:

10.1145/3230643

Chinese Library Classification number:

TP3 [Computing Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Memory-intensive workloads are becoming increasingly popular on general-purpose graphics processing units (GPGPUs) and impose great challenges on GPGPU memory subsystem design. Meanwhile, with the recent development of non-volatile memory (NVM) technologies, hybrid memory combining DRAM and NVM achieves high performance, low power, and high density simultaneously, which makes it a promising main memory design for GPGPUs. In this article, we explore shared last-level cache management for GPGPUs with consideration of the underlying hybrid main memory. To improve overall memory subsystem performance, we exploit both the asymmetric read/write latency of the hybrid main memory architecture and the memory coalescing feature of GPGPUs. In particular, to reduce the average cost of L2 cache misses, we prioritize cache blocks from DRAM or NVM based on the observation that operations to the NVM part of main memory have a large impact on system performance. Furthermore, the cache management scheme integrates GPU memory coalescing and cache bypassing techniques to improve overall system performance. To minimize the impact of memory divergence among simultaneously executed groups of threads, we propose a hybrid-main-memory-aware and warp-aware memory scheduling mechanism for GPGPUs. Experimental results show that, in the context of a hybrid main memory system, our proposed L2 cache management policy and memory scheduling mechanism improve performance by 15.69% on average for memory-intensive benchmarks, with a maximum gain of up to 29%, and reduce memory subsystem energy by 21.27% on average.

Pages: 25
