Reducing Load Latency with Cache Level Prediction

Cited by: 6
Authors
Jalili, Majid [1 ]
Erez, Mattan [1 ]
Institutions
[1] Univ Texas Austin, Austin, TX 78712 USA
Funding
U.S. National Science Foundation;
Keywords
SPECULATION;
DOI
10.1109/HPCA53966.2022.00054
CLC (Chinese Library Classification)
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
High load latency that results from deep cache hierarchies and relatively slow main memory is an important limiter of single-thread performance. Data prefetching helps hide this latency by fetching data up the hierarchy before it is requested by load instructions. However, prefetching has been shown to be imperfect in many situations. We propose cache-level prediction to complement prefetchers. Our method predicts which level of the memory hierarchy a load will access, allowing the access to that level to begin earlier and thereby saving many cycles. The predictor achieves high accuracy at the cost of just one cycle of added latency on L1 misses. Level prediction reduces memory access latency by 20% on average and provides a speedup of 10.3% over a conventional baseline and 6.1% over a boosted baseline across generic, graph, and HPC applications.
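To make the idea concrete, below is a minimal sketch of one plausible cache-level predictor organization: a PC-indexed table of per-level saturating counters, trained on the level at which each load actually hits. The table size, counter widths, indexing, and tie-breaking policy are illustrative assumptions for this sketch, not the predictor design described in the paper.

#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>

// Memory-hierarchy levels a load can resolve at.
enum Level : std::uint8_t { L1 = 0, L2, L3, DRAM, NUM_LEVELS };

class LevelPredictor {
    static constexpr std::size_t TABLE_SIZE = 4096;  // assumed size, power of two
    // One 2-bit saturating counter per level, per table entry.
    std::array<std::array<std::uint8_t, NUM_LEVELS>, TABLE_SIZE> ctr{};

    static std::size_t index(std::uint64_t pc) {
        return (pc >> 2) & (TABLE_SIZE - 1);  // simple PC hash (assumption)
    }

public:
    // Predict the level a load at `pc` will hit: the level with the highest
    // counter, biased toward L1 on ties as a safe default.
    Level predict(std::uint64_t pc) const {
        const auto& e = ctr[index(pc)];
        Level best = L1;
        for (int l = L1 + 1; l < NUM_LEVELS; ++l)
            if (e[l] > e[best]) best = static_cast<Level>(l);
        return best;
    }

    // Train once the load resolves: strengthen the counter for the level
    // that actually serviced the load, weaken the others.
    void train(std::uint64_t pc, Level actual) {
        auto& e = ctr[index(pc)];
        for (int l = L1; l < NUM_LEVELS; ++l) {
            if (l == actual) { if (e[l] < 3) ++e[l]; }
            else             { if (e[l] > 0) --e[l]; }
        }
    }
};

int main() {
    LevelPredictor p;
    std::uint64_t pc = 0x401a20;                  // hypothetical load PC
    for (int i = 0; i < 4; ++i) p.train(pc, L3);  // load repeatedly hits in L3
    std::cout << "predicted level: " << int(p.predict(pc)) << "\n";  // prints 2 (L3)
}

In this sketch, a load predicted to hit in L3 could issue its L3 access in parallel with (or instead of) the L1/L2 lookups, which is the latency-saving opportunity the abstract describes; the one-cycle cost quoted above would correspond to consulting such a table on the L1 miss path.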
Pages: 648-661
Page count: 14