Trace cache miss tolerance for deeply pipelined superscalar processors

Cited by: 1

Authors:

Reinman, G. [1]

Pitigoi-Aron, G. [1]

Affiliations:

[1] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA

Source:

IEE PROCEEDINGS-COMPUTERS AND DIGITAL TECHNIQUES | 2006 / Vol. 153 / Issue 05

DOI:

10.1049/ip-cdt:20050161

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

The trace cache is a technique that provides accurate, high-bandwidth instruction fetch. However, when a desired instruction trace is not found in the cache, conventional instruction fetch and decode must be used to satisfy the trace request. Such auxiliary fetch hardware can be expensive in terms of energy, area and complexity. An approach to combine a trace cache and conventional instruction fetch hardware using a decoupled design is explored. The design enables the processor to dynamically switch between trace ID and PC-based prediction methods and helps to hide the latency associated with the instruction memory path. The decoupled design with accelerated slow path instruction delivery and no instruction cache is able to provide comparable benefit to a front-end with an 8 kB instruction cache (within 2% of the instructions per cycle with the cache). High tolerance can be demonstrated for both trace table misses and increased memory latency when scaling down the size of the trace table and scaling up the L2 access latency.

Pages: 355 - 361

Page count: 7

27 items in total

[1] The limits of speculative trace reuse on deeply pipelined processors
Pilla, ML
Navaux, POA
da Costa, AT
França, FMG
Childers, BR
Soffa, ML
15TH SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING, PROCEEDINGS, 2003, : 36 - 44
[2] Compiler Optimization for Superscalar and Pipelined Processors
Bharadwaj, Vishnu P.
Rao, Mahesh
PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING, VLSI, ELECTRICAL CIRCUITS AND ROBOTICS (DISCOVER), 2016, : 232 - 236
[3] Secure and Efficient Software Masking on Superscalar Pipelined Processors
Gigerl, Barbara
Primas, Robert
Mangard, Stefan
ADVANCES IN CRYPTOLOGY - ASIACRYPT 2021, PT II, 2021, 13091 : 3 - 32
[4] On the functional test of the BTB logic in pipelined and superscalar processors
Changdao, D.
Graziano, M.
Sanchez, E.
Reorda, M. Sonza
Zamboni, M.
Zhifan, N.
2013 14TH IEEE LATIN-AMERICAN TEST WORKSHOP (LATW2013), 2013,
[5] Incorporating fault tolerance in superscalar processors
Franklin, M
3RD INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, PROCEEDINGS, 1996, : 301 - 306
[6] Boosting SMT trace processors performance with data cache miss sensitive thread scheduling mechanism
Wang, Kai-feng
Ji, Zhen-zhou
Hu, Ming-zeng
MICROPROCESSORS AND MICROSYSTEMS, 2006, 30 (05) : 225 - 233
[7] A transparent transient faults tolerance mechanism for superscalar processors
Sato, T
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2003, E86D (12) : 2508 - 2516
[8] ELIMINATING INTERLOCKS IN DEEPLY PIPELINED PROCESSORS BY DELAY ENFORCED MULTISTREAMING
MCCRACKIN, DC
IEEE TRANSACTIONS ON COMPUTERS, 1991, 40 (10) : 1125 - 1132
[9] Reducing state loss for effective trace sampling of superscalar processors
Conte, TM
Hirsch, MA
Menezes, KN
INTERNATIONAL CONFERENCE ON COMPUTER DESIGN - VLSI IN COMPUTERS AND PROCESSORS, PROCEEDINGS, 1996, : 468 - 477
[10] Exploring the performance of split data cache schemes on superscalar processors and symmetric multiprocessors
Sahuquillo, J
Petit, S
Pont, A
Milutinovic, V
JOURNAL OF SYSTEMS ARCHITECTURE, 2005, 51 (08) : 451 - 469