Mirage Cores: The Illusion of Many Out-of-order Cores Using In-order Hardware

Cited by: 5

Authors:

Padmanabha, Shruti [1]

Lukefahr, Andrew [2]

Das, Reetuparna [1]

Mahlke, Scott [1]

Affiliations:

[1] Univ Michigan, Ann Arbor, MI 48109 USA

[2] Indiana Univ, Bloomington, IN 47405 USA

Source:

50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO) | 2017

Funding:

National Science Foundation (USA);

Keywords:

Heterogeneous multicores; Energy-efficient architectures; CMP scheduling; POWER MANAGEMENT; PERFORMANCE; IMPACT;

DOI:

10.1145/3123939.3123969

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Heterogeneous chip multiprocessors (Het-CMPs) offer a combination of large Out-of-Order (OoO) cores optimized for high single-threaded performance and small In-Order (InO) cores optimized for low energy and area costs. Due to practical constraints, CMP designers must choose to either optimize for total system throughput by utilizing many InO cores or maximize single-thread execution with fewer OoO cores. We propose Mirage Cores, a novel Het-CMP design where clusters of InO cores are architected around an OoO core in a manner that optimizes for both throughput and single-thread performance. The insight behind Mirage Cores is that InO cores can achieve near-OoO performance if they are provided with the dynamic instruction schedule of an OoO core. To leverage this, Mirage Cores employs an OoO core as an optimal instruction schedule generator as well as a high-performance alternative for all neighboring InO cores. We also develop intelligent runtime schedulers which orchestrate the arbitration and migration of applications between the InO cores and the central OoO core. Fast and timely transfer of dynamic schedules from the OoO core to the InO cores allows Mirage Cores to create the appearance of all OoO cores to the user using underlying In-Order hardware. Overall, with an 8 InO per OoO configuration, Mirage Cores can achieve on average 84% of the performance of a CMP with 8 OoO cores, a 28% increase relative to current systems, while conserving 55% of energy and 25% of area costs. We find that we can scale the design to around 12 InOs per OoO before starvation for the OoO starts to hamper system performance.

Pages: 745-758

Page count: 14
