Revolver: Processor Architecture for Power Efficient Loop Execution

Cited by: 0
Authors
Hayenga, Mitchell [1, 2]
Naresh, Vignyan Reddy Kothinti [2]
Lipasti, Mikko H. [2]
Affiliations
[1] ARM Inc, Cambridge, England
[2] Univ Wisconsin, Madison, WI 53706 USA
Keywords
CACHE;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the rise of mobile and cloud-based computing, modern processor design has become the task of achieving maximum power efficiency at specific performance targets. This trend, coupled with dwindling improvements in single-threaded performance, has led architects to focus predominantly on energy efficiency. In this paper, we note that for the majority of benchmarks, a substantial portion of execution time is spent executing simple loops. Capitalizing on the frequency of loops, we design an out-of-order processor architecture that achieves an aggressive level of performance while minimizing the energy consumed during the execution of loops. The Revolver architecture achieves energy efficiency during loop execution by enabling "in-place execution" of loops within the processor's out-of-order backend. Essentially, a few static instances of each loop instruction are dispatched to the out-of-order execution core by the processor frontend. Each static instruction instance may be executed multiple times in order to complete all necessary loop iterations. During loop execution, the processor frontend, including instruction fetch, branch prediction, decode, allocation, and dispatch logic, can be completely clock gated. Additionally, we propose a mechanism to pre-execute load instructions from future loop iterations, thereby realizing parallelism beyond the loop iterations currently executing within the processor core. Employing Revolver across three benchmark suites, we eliminate 20%, 55%, and 84% of all frontend instruction dispatches. Overall, we find that Revolver maintains performance while delivering a 5.3%-18.3% energy-delay benefit over loop buffers or micro-op cache techniques alone.
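The dispatch savings described in the abstract can be illustrated with a toy counting model. The sketch below is not the paper's simulator or its detection mechanism: the `Loop` type, the `detect_iterations` warm-up count, and the `copies` parameter (the number of static loop-body instances kept in the backend) are hypothetical names chosen for illustration, and the numbers it yields are unrelated to the reported 20%, 55%, and 84% figures.

```python
from dataclasses import dataclass


@dataclass
class Loop:
    body_size: int   # static instructions in the loop body
    iterations: int  # dynamic trip count


def conventional_dispatches(loop: Loop) -> int:
    """Every dynamic instruction passes through the frontend."""
    return loop.body_size * loop.iterations


def in_place_dispatches(loop: Loop, detect_iterations: int = 2, copies: int = 2) -> int:
    """Frontend dispatches under a simplified in-place execution model.

    Assumption (not from the paper): the first `detect_iterations`
    iterations are dispatched normally while the loop is being detected.
    After that, at most `copies` static instances of the body are
    dispatched once and re-executed in the backend for all remaining
    iterations, so the frontend stays idle.
    """
    normal = min(loop.iterations, detect_iterations)
    remaining = loop.iterations - normal
    resident = min(copies, remaining)
    return loop.body_size * (normal + resident)


def fraction_eliminated(loop: Loop) -> float:
    """Share of frontend dispatches avoided relative to the conventional case."""
    base = conventional_dispatches(loop)
    return 1.0 - in_place_dispatches(loop) / base


if __name__ == "__main__":
    hot_loop = Loop(body_size=8, iterations=100)
    print(conventional_dispatches(hot_loop))  # 800
    print(in_place_dispatches(hot_loop))      # 32
```

In this model the benefit grows with trip count: a loop that exits during the assumed detection window sees no savings, while long-running loops amortize the few dispatched instances over many iterations.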
Pages: 591-602
Page count: 12