Reducing Load Latency with Cache Level Prediction

Cited by: 6
Authors
Jalili, Majid [1 ]
Erez, Mattan [1 ]
Institutions
[1] Univ Texas Austin, Austin, TX 78712 USA
Funding
U.S. National Science Foundation;
Keywords
SPECULATION;
DOI
10.1109/HPCA53966.2022.00054
CLC (Chinese Library Classification)
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
High load latency that results from deep cache hierarchies and relatively slow main memory is an important limiter of single-thread performance. Data prefetching helps hide this latency by fetching data up the hierarchy before it is requested by load instructions. However, prefetching has been shown to be imperfect in many situations. We propose cache-level prediction to complement prefetchers. Our method predicts which level of the memory hierarchy a load will access, allowing the access to that level to begin earlier and thereby saving many cycles. The predictor achieves high accuracy at the cost of just one cycle of added latency on L1 misses. Level prediction reduces memory access latency by 20% on average and provides a speedup of 10.3% over a conventional baseline and 6.1% over a boosted baseline across generic, graph, and HPC applications.
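To make the idea concrete, below is a minimal sketch of one plausible cache-level predictor organization: a PC-indexed table of per-level saturating counters, trained on the level at which each load actually hits. The table size, counter widths, indexing, and tie-breaking policy are illustrative assumptions for this sketch, not the predictor design described in the paper.

#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>

// Memory-hierarchy levels a load can resolve at.
enum Level : std::uint8_t { L1 = 0, L2, L3, DRAM, NUM_LEVELS };

class LevelPredictor {
    static constexpr std::size_t TABLE_SIZE = 4096;  // assumed size, power of two
    // One 2-bit saturating counter per level, per table entry.
    std::array<std::array<std::uint8_t, NUM_LEVELS>, TABLE_SIZE> ctr{};

    static std::size_t index(std::uint64_t pc) {
        return (pc >> 2) & (TABLE_SIZE - 1);  // simple PC hash (assumption)
    }

public:
    // Predict the level a load at `pc` will hit: the level with the highest
    // counter, biased toward L1 on ties as a safe default.
    Level predict(std::uint64_t pc) const {
        const auto& e = ctr[index(pc)];
        Level best = L1;
        for (int l = L1 + 1; l < NUM_LEVELS; ++l)
            if (e[l] > e[best]) best = static_cast<Level>(l);
        return best;
    }

    // Train once the load resolves: strengthen the counter for the level
    // that actually serviced the load, weaken the others.
    void train(std::uint64_t pc, Level actual) {
        auto& e = ctr[index(pc)];
        for (int l = L1; l < NUM_LEVELS; ++l) {
            if (l == actual) { if (e[l] < 3) ++e[l]; }
            else             { if (e[l] > 0) --e[l]; }
        }
    }
};

int main() {
    LevelPredictor p;
    std::uint64_t pc = 0x401a20;                  // hypothetical load PC
    for (int i = 0; i < 4; ++i) p.train(pc, L3);  // load repeatedly hits in L3
    std::cout << "predicted level: " << int(p.predict(pc)) << "\n";  // prints 2 (L3)
}

In this sketch, a load predicted to hit in L3 could issue its L3 access in parallel with (or instead of) the L1/L2 lookups, which is the latency-saving opportunity the abstract describes; the one-cycle cost quoted above would correspond to consulting such a table on the L1 miss path.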
Pages: 648-661
Page count: 14