Trace cache miss tolerance for deeply pipelined superscalar processors

Cited by: 1
Authors
Reinman, G. [1 ]
Pitigoi-Aron, G. [1 ]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
Source
Keywords
DOI
10.1049/ip-cdt:20050161
Chinese Library Classification (CLC)
TP3 [computing technology, computer technology]
Subject classification code
0812
Abstract
The trace cache is a technique that provides accurate, high-bandwidth instruction fetch. However, when a desired instruction trace is not found in the cache, conventional instruction fetch and decode must be used to satisfy the trace request. Such auxiliary fetch hardware can be expensive in terms of energy, area and complexity. An approach that combines a trace cache and conventional instruction fetch hardware in a decoupled design is explored. The design enables the processor to dynamically switch between trace-ID-based and PC-based prediction and helps to hide the latency of the instruction memory path. The decoupled design with accelerated slow-path instruction delivery and no instruction cache provides benefit comparable to a front-end with an 8 kB instruction cache (within 2% of the instructions per cycle achieved with the cache). High tolerance is demonstrated for both trace table misses and increased memory latency when the trace table is scaled down and the L2 access latency is scaled up.
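To make the decoupled arrangement concrete, the Python sketch below is a purely illustrative model, not the paper's implementation: a predictor enqueues trace IDs ahead of fetch, the fetch stage first probes a trace table, and a slower PC-based path rebuilds and caches the trace on a miss. The class names, the slow-path latency, and the fixed eight-instruction trace length are assumptions made for the example.

# Illustrative sketch (hypothetical names and parameters) of a decoupled
# front-end with a trace table and a slow PC-based fallback path.
from collections import deque

class TraceTable:
    """Maps a trace ID (start PC plus branch outcomes) to a prebuilt trace."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}                    # trace_id -> list of instruction PCs

    def lookup(self, trace_id):
        return self.entries.get(trace_id)

    def insert(self, trace_id, trace):
        if len(self.entries) >= self.capacity:
            self.entries.pop(next(iter(self.entries)))   # naive eviction for the sketch
        self.entries[trace_id] = trace

class Frontend:
    """Decoupled front-end: a predictor fills a queue of trace IDs; the fetch
    stage consumes them, falling back to a slow PC-based path on a miss."""
    def __init__(self, trace_table, slow_path_latency=4):
        self.trace_table = trace_table
        self.pending = deque()               # queue decouples prediction from fetch
        self.slow_path_latency = slow_path_latency

    def predict(self, trace_id):
        self.pending.append(trace_id)        # predictor runs ahead of fetch

    def fetch(self):
        """Return (instructions, cycles) for the next pending trace request."""
        if not self.pending:
            return [], 0
        trace_id = self.pending.popleft()
        trace = self.trace_table.lookup(trace_id)
        if trace is not None:
            return trace, 1                  # fast path: served from the trace table
        # Slow path: rebuild the trace with PC-based fetch/decode, then cache it.
        start_pc = trace_id[0]
        rebuilt = [start_pc + 4 * i for i in range(8)]   # stand-in for fetch + decode
        self.trace_table.insert(trace_id, rebuilt)
        return rebuilt, self.slow_path_latency

if __name__ == "__main__":
    fe = Frontend(TraceTable(capacity=64))
    tid = (0x1000, "TNT")                    # start PC plus branch-outcome string
    fe.predict(tid)
    print(fe.fetch())                        # miss: pays the slow-path latency
    fe.predict(tid)
    print(fe.fetch())                        # hit: one-cycle trace table delivery

Because the predictor keeps filling the queue while the slow path rebuilds a trace, later requests can overlap with the rebuild, which is the latency-hiding effect the decoupled design relies on.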
Pages: 355-361
Page count: 7
Related papers
27 records in total
  • [21] Hiding cache miss penalty using priority-based execution for embedded processors
    Park, Sanghyun
    Shrivastava, Aviral
    Paek, Yunheung
2008 DESIGN, AUTOMATION AND TEST IN EUROPE, VOLS 1-3, 2008, : 1032+
  • [22] Miss reduction in embedded processors through dynamic, power-friendly cache design
    Bournoutian, Garo
    Orailoglu, Alex
    2008 45TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2, 2008, : 304 - 309
  • [23] DSTRIDE: Data-cache miss-address-based stride prefetching scheme for multimedia processors
    Hariprakash, G
    Achutharaman, R
    Omondi, AR
PROCEEDINGS OF THE 6TH AUSTRALASIAN COMPUTER SYSTEMS ARCHITECTURE CONFERENCE, ACSAC 2001, 2001, 23 (04) : 62 - 70
  • [24] CPU Scheduling for Power/Energy Management on Multicore Processors Using Cache Miss and Context Switch Data
    Datta, Ajoy K.
    Patel, Rajesh
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2014, 25 (05) : 1190 - 1199
  • [25] A Data-sharing Aware and Scalable Cache Miss Rates Model for Multi-core Processors with Multi-level Cache Hierarchies
    Wang, Guangmin
    Ge, Jiancong
    Yan, Yunhao
    Ling, Ming
    2019 IEEE 25TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2019, : 267 - 274
  • [26] Path-classified trace cache for improving hit ratio in wide-issue processors
    Yang, JH
    Park, IC
    Kyung, CM
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1999, E82D (10) : 1338 - 1343
  • [27] Low-cost microarchitectural techniques for enhancing the prediction of return addresses on high-performance trace cache processors
    Shi, Yunhe
    Ozer, Emre
    Gregg, David
    COMPUTER AND INFORMATION SCIENCES - ISCIS 2006, PROCEEDINGS, 2006, 4263 : 248 - 257