Mirage Cores: The Illusion of Many Out-of-order Cores Using In-order Hardware

Cited by: 5

Authors:

Padmanabha, Shruti [1]

Lukefahr, Andrew [2]

Das, Reetuparna [1]

Mahlke, Scott [1]

Affiliations:

[1] Univ Michigan, Ann Arbor, MI 48109 USA

[2] Indiana Univ, Bloomington, IN 47405 USA

Source:

50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO) | 2017

Funding:

U.S. National Science Foundation;

Keywords:

Heterogeneous multicores; Energy-efficient architectures; CMP scheduling; POWER MANAGEMENT; PERFORMANCE; IMPACT;

DOI:

10.1145/3123939.3123969

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Heterogeneous chip multiprocessors (Het-CMPs) offer a combination of large Out-of-Order (OoO) cores optimized for high single-threaded performance and small In-Order (InO) cores optimized for low energy and area costs. Due to practical constraints, CMP designers must choose to either optimize for total system throughput by utilizing many InO cores or maximize single-thread execution with fewer OoO cores. We propose Mirage Cores, a novel Het-CMP design where clusters of InO cores are architected around an OoO core in a manner that optimizes for both throughput and single-thread performance. The insight behind Mirage Cores is that InO cores can achieve near-OoO performance if they are provided with the dynamic instruction schedule of an OoO core. To leverage this, Mirage Cores employs an OoO core as an optimal instruction schedule generator as well as a high-performance alternative for all neighboring InO cores. We also develop intelligent runtime schedulers which orchestrate the arbitration and migration of applications between the InO cores and the central OoO core. Fast and timely transfer of dynamic schedules from the OoO core to the InO cores allows Mirage Cores to create the appearance of all OoO cores to the user using underlying In-Order hardware. Overall, with an 8 InO per OoO configuration, Mirage Cores can achieve on average 84% of the performance of a CMP with 8 OoO cores, a 28% increase relative to current systems, while conserving 55% of energy and 25% of area costs. We find that we can scale the design to around 12 InOs per OoO before starvation for the OoO starts to hamper system performance.
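
Illustrative note (not part of the original record): the abstract's central insight, that an in-order core can approach out-of-order performance when it replays an OoO-generated instruction schedule, can be sketched with a toy model. Everything below is an assumption made for illustration and is not the authors' mechanism: the `Instr` type, the single-issue timing model, the greedy `ooo_schedule` function, the latencies, and the requirement that each register is written only once (as if already renamed).

```python
from typing import NamedTuple, Tuple, List

class Instr(NamedTuple):
    name: str
    dest: str               # register written (assumed unique per instruction)
    srcs: Tuple[str, ...]   # registers read
    latency: int            # cycles until the result is available

def inorder_cycles(stream: List[Instr]) -> int:
    """Cycles taken by a single-issue in-order core that stalls on unready sources."""
    ready = {}    # register -> cycle at which its value becomes available
    slot = 0      # next free issue slot
    finish = 0
    for ins in stream:
        issue = max([slot] + [ready.get(s, 0) for s in ins.srcs])
        done = issue + ins.latency
        ready[ins.dest] = done
        slot = issue + 1
        finish = max(finish, done)
    return finish

def ooo_schedule(program: List[Instr]) -> List[Instr]:
    """Toy OoO issue order: each cycle, issue the oldest instruction whose sources are ready."""
    ready = {}
    pending = list(program)
    order = []
    cycle = 0
    while pending:
        unwritten = {p.dest for p in pending}  # results not yet produced
        for ins in pending:
            if all(s not in unwritten and ready.get(s, 0) <= cycle for s in ins.srcs):
                pending.remove(ins)
                order.append(ins)
                ready[ins.dest] = cycle + ins.latency
                break  # single issue per cycle
        cycle += 1
    return order

# Hypothetical program: two independent load/use chains feeding one add.
program = [
    Instr("i0", "r1", (), 3),            # load
    Instr("i1", "r2", ("r1",), 1),       # uses first load
    Instr("i2", "r3", (), 3),            # independent load
    Instr("i3", "r4", ("r3",), 1),       # uses second load
    Instr("i4", "r5", ("r2", "r4"), 1),  # combines both chains
]

schedule = ooo_schedule(program)
program_order_cycles = inorder_cycles(program)
replayed_cycles = inorder_cycles(schedule)
print([i.name for i in schedule], program_order_cycles, replayed_cycles)
```

In this toy model the in-order core needs 9 cycles in program order and 6 cycles when it issues the same instructions in the recorded OoO order, because the second load is hoisted above the first load's consumer. The sketch ignores branch behavior, memory dependences, and schedule transfer costs, all of which a real design has to handle.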


Pages: 745-758

Page count: 14

50 related records in total

[1] Recycling Data Slack in Out-of-Order Cores
Ravi, Gokul Subramanian
Lipasti, Mikko H.
2019 25TH IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA), 2019, : 545 - 557
[2] Efficiently Scaling Out-of-Order Cores for Simultaneous Multithreading
Sleiman, Faissal M.
Wenisch, Thomas F.
2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, : 431 - 443
[3] Federation: Repurposing scalar cores for out-of-order instruction issue
Tarjan, David
Boyer, Michael
Skadron, Kevin
2008 45TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2, 2008, : 772 - 775
[4] Achieving out-of-order performance with almost in-order complexity
Tseng, Francis
Patt, Yale N.
ISCA 2008 PROCEEDINGS: 35TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, 2008, : 3 - 12
[5] Out-of-order Transmission for In-order Arrival Scheduling for Multipath TCP
Yang, Fan
Wang, Qi
Amer, Paul D.
2014 28TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS WORKSHOPS (WAINA), 2014, : 749 - 752
[6] Memory Hierarchy Calibration Based on Real Hardware In-order Cores for Accurate Simulation
Huppert, Quentin
Evenblij, Timon
Perumkunnil, Manu
Catthoor, Francky
Torres, Lionel
Novo, David
PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021), 2021, : 707 - 710
[7] Analyzing the Impact of Supporting Out-of-Order Communication on In-order Performance with iWARP
Balaji, P.
Feng, W.
Bhagvat, S.
Panda, D. K.
Thakur, R.
Gropp, W.
2007 ACM/IEEE SC07 CONFERENCE, 2010, : 615 - +
[8] RIO: ROB-Centric In-Order Modeling of Out-of-Order Processors
Heirman, Wim
Eyerman, Stijn
Du Bois, Kristof
Hur, Ibrahim
IEEE COMPUTER ARCHITECTURE LETTERS, 2021, 20 (01) : 78 - 81
[9] Student Research Poster: Software Out-of-Order Execution for In-Order Architectures
Tran, Kim-Anh
2016 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION TECHNIQUES (PACT), 2016, : 458 - 458
[10] Reusing cached schedules in an out-of-order processor with in-order issue logic
Palomar, Oscar
Juan, Toni
Navarro, Juan J.
2009 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, 2009, : 246 - +
