Trace cache miss tolerance for deeply pipelined superscalar processors

Cited by: 1
Authors
Reinman, G. [1]
Pitigoi-Aron, G. [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
DOI
10.1049/ip-cdt:20050161
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The trace cache is a technique that provides accurate, high-bandwidth instruction fetch. However, when a desired instruction trace is not found in the cache, conventional instruction fetch and decode must be used to satisfy the trace request. Such auxiliary fetch hardware can be expensive in terms of energy, area and complexity. An approach to combine a trace cache and conventional instruction fetch hardware using a decoupled design is explored. The design enables the processor to dynamically switch between trace ID and PC-based prediction methods and helps to hide the latency associated with the instruction memory path. The decoupled design with accelerated slow-path instruction delivery and no instruction cache is able to provide comparable benefit to a front-end with an 8 kB instruction cache (within 2% of the instructions per cycle with the cache). High tolerance can be demonstrated for both trace table misses and increased memory latency when scaling down the size of the trace table and scaling up the L2 access latency.
Pages: 355 - 361
Page count: 7
Related papers
27 in total
  • [1] The limits of speculative trace reuse on deeply pipelined processors
    Pilla, ML
    Navaux, POA
    da Costa, AT
    França, FMG
    Childers, BR
    Soffa, ML
    15TH SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING, PROCEEDINGS, 2003, : 36 - 44
  • [2] Compiler Optimization for Superscalar and Pipelined Processors
    Bharadwaj, Vishnu P.
    Rao, Mahesh
    PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING, VLSI, ELECTRICAL CIRCUITS AND ROBOTICS (DISCOVER), 2016, : 232 - 236
  • [3] Secure and Efficient Software Masking on Superscalar Pipelined Processors
    Gigerl, Barbara
    Primas, Robert
    Mangard, Stefan
    ADVANCES IN CRYPTOLOGY - ASIACRYPT 2021, PT II, 2021, 13091 : 3 - 32
  • [4] On the functional test of the BTB logic in pipelined and superscalar processors
    Changdao, D.
    Graziano, M.
    Sanchez, E.
    Reorda, M. Sonza
    Zamboni, M.
    Zhifan, N.
    2013 14TH IEEE LATIN-AMERICAN TEST WORKSHOP (LATW2013), 2013,
  • [5] Incorporating fault tolerance in superscalar processors
    Franklin, M
    3RD INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, PROCEEDINGS, 1996, : 301 - 306
  • [6] Boosting SMT trace processors performance with data cache miss sensitive thread scheduling mechanism
    Wang, Kai-feng
    Ji, Zhen-zhou
    Hu, Ming-zeng
    MICROPROCESSORS AND MICROSYSTEMS, 2006, 30 (05) : 225 - 233
  • [7] A transparent transient faults tolerance mechanism for superscalar processors
    Sato, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2003, E86D (12) : 2508 - 2516
  • [8] ELIMINATING INTERLOCKS IN DEEPLY PIPELINED PROCESSORS BY DELAY ENFORCED MULTISTREAMING
    MCCRACKIN, DC
    IEEE TRANSACTIONS ON COMPUTERS, 1991, 40 (10) : 1125 - 1132
  • [9] Reducing state loss for effective trace sampling of superscalar processors
    Conte, TM
    Hirsch, MA
    Menezes, KN
    INTERNATIONAL CONFERENCE ON COMPUTER DESIGN - VLSI IN COMPUTERS AND PROCESSORS, PROCEEDINGS, 1996, : 468 - 477
  • [10] Exploring the performance of split data cache schemes on superscalar processors and symmetric multiprocessors
    Sahuquillo, J
    Petit, S
    Pont, A
    Milutinovic, V
    JOURNAL OF SYSTEMS ARCHITECTURE, 2005, 51 (08) : 451 - 469