CG-OoO: Energy-Efficient Coarse-Grain Out-of-Order Execution Near In-Order Energy with Near Out-of-Order Performance

Cited by: 1
Authors
Mohammadi, Milad [1]
Aamodt, Tor M. [2]
Dally, William J. [3, 4]
Affiliations
[1] Stanford Univ, Comp Syst Lab, Gates Room 241, Stanford, CA 94305 USA
[2] Univ British Columbia, Dept Elect & Comp Engn, 2332 Main Mall, Vancouver, BC, Canada
[3] NVIDIA, Santa Clara, CA USA
[4] Stanford Univ, Comp Syst Lab, Gates Room 301, Stanford, CA 94305 USA
Keywords
Energy efficiency; CPU architecture; block-level execution; DYNAMIC INSTRUMENTATION; INSTRUCTION SET; DESIGN; MICROARCHITECTURE; ARCHITECTURES; END
DOI
10.1145/3151034
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We introduce the Coarse-Grain Out-of-Order (CG-OoO) general-purpose processor, designed to achieve close to In-Order (InO) processor energy while maintaining Out-of-Order (OoO) performance. CG-OoO is an energy-performance-proportional architecture. Block-level code processing is at the heart of this architecture; CG-OoO speculates, fetches, schedules, and commits code at block-level granularity. It eliminates unnecessary accesses to energy-consuming tables and turns large tables into smaller, distributed tables that are cheaper to access. CG-OoO leverages compiler-level code optimizations to deliver efficient static code and exploits dynamic block-level and instruction-level parallelism. CG-OoO introduces Skipahead, a complexity-effective, limited out-of-order instruction scheduling model. Through the energy efficiency techniques applied to the compiler and processor pipeline stages, CG-OoO closes 62% of the average energy gap between the InO and OoO baseline processors at the same area and nearly the same performance as the OoO. This makes CG-OoO 1.8x more efficient than the OoO on the energy-delay product inverse metric. CG-OoO meets the OoO nominal performance while trading off the peak scheduling performance for superior energy efficiency.
Pages: 26
Related papers
50 in total
  • [21] Mirage Cores: The Illusion of Many Out-of-order Cores Using In-order Hardware
    Padmanabha, Shruti
    Lukefahr, Andrew
    Das, Reetuparna
    Mahlke, Scott
    50TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2017, : 745 - 758
  • [22] Cheap out-of-order execution using delayed issue
    Grossman, JP
    2000 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN: VLSI IN COMPUTERS & PROCESSORS, PROCEEDINGS, 2000, : 549 - 551
  • [23] Symbolic Predictive Cache Analysis for Out-of-Order Execution
    Huang, Zunchen
    Wang, Chao
    FUNDAMENTAL APPROACHES TO SOFTWARE ENGINEERING, FASE 2022, 2022, 13241 : 163 - 183
  • [24] Automatic Refinement Checking of Pipelines with Out-of-Order Execution
    Srinivasan, Sudarshan K.
    IEEE TRANSACTIONS ON COMPUTERS, 2010, 59 (08) : 1138 - 1144
  • [25] Formal verification of out-of-order execution with incremental flushing
    Jones, RB
    Skakkebæk, JU
    Dill, DL
    FORMAL METHODS IN SYSTEM DESIGN, 2002, 20 (02) : 139 - 158
  • [26] OUT-OF-ORDER EXECUTION AND STRUCTURAL EQUIVALENCE OF SIMULATION MODELS
    Bergen-Hill, Tobin A.
    Page, Ernest H.
    PROCEEDINGS OF THE 2010 WINTER SIMULATION CONFERENCE, 2010, : 466 - 478
  • [27] Formal Verification of Out-of-Order Execution with Incremental Flushing
    Robert B. Jones
    Jens U. Skakkebæk
    David L. Dill
    Formal Methods in System Design, 2002, 20 : 139 - 158
  • [28] OVM: Out-of-order execution parallel Virtual Machine
    Bosilca, G
    Fedak, G
    Cappello, F
    FIRST IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, PROCEEDINGS, 2001, : 212 - 220
  • [29] On the correctness of hardware scheduling mechanisms for out-of-order execution
    Mueller, SM
    Paul, WJ
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 1998, 8 (02) : 301 - 314
  • [30] An out-of-order execution technique for runtime binary translators
    Le, BC
    ACM SIGPLAN NOTICES, 1998, 33 (11) : 151 - 158