Reconstructing Out-of-Order Issue Queue

Cited by: 2
Authors
Jeong, Ipoom [1]
Lee, Jiwon [1]
Yoon, Myung Kuk [2]
Ro, Won Woo [1]
Affiliations
[1] Yonsei Univ, Sch Elect & Elect Engn, Seoul, South Korea
[2] Ewha Womans Univ, Dept Comp Sci & Engn, Seoul, South Korea
Funding
National Research Foundation, Singapore
Keywords
Dynamic Scheduling; Data Dependence; Steering; Instruction; Microarchitecture; Core
DOI
10.1109/MICRO56248.2022.00023
Chinese Library Classification number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Out-of-order cores provide high performance at the cost of energy efficiency. Dynamic scheduling is one of the major contributors to this: generating highly optimized issue schedules considering both data dependences and underlying execution resources, but relying heavily on complex wakeup and select operations of an out-of-order issue queue (IQ). For decades, researchers have proposed several complexity-effective dynamic scheduling schemes by leveraging the energy efficiency of an in-order IQ. However, they are either costly or not capable of delivering sufficient performance to substitute for a conventional wide-issue out-of-order IQ. In this work, we revisit two previous designs: one classical dependence-based design and the other state-of-the-art readiness-based design. We observe that they are complementary to each other, and thus their synergistic integration has the potential to be a good alternative to an out-of-order IQ. We first combine these two designs, and further analyze the main architectural bottlenecks that incur the underutilization of aggregate issue capability, thereby limiting the exploitation of instruction-level and memory-level parallelisms: 1) memory dependences not exposed by the register-based dependence analysis and 2) wide and shallow nature of dynamic dependence chains due to the long-latency memory accesses. To this end, we propose Ballerino, a novel microarchitecture that performs balanced and cache-miss-tolerable dynamic scheduling via a complementary combination of cascaded and clustered in-order IQs. Ballerino is built upon three key functionalities: 1) speculatively filtering out ready-at-dispatch instructions, 2) eliminating wasteful wakeup operations via a simple steering technique leveraging the awareness of memory dependences, and 3) reacting to program phase changes by allowing different load-dependent chains to share a single IQ while guaranteeing their out-of-order issue. 
The net effect is minimal scheduling energy consumption per instruction while providing comparable scheduling performance to a fully out-of-order IQ. In our analysis, Ballerino achieves comparable performance to an 8-wide out-of-order core by using twelve in-order IQs, improving core-wide energy efficiency by 20%.
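The classical dependence-based design that the abstract revisits can be illustrated with a small sketch. The code below is not Ballerino's implementation and is not taken from the paper. It is a toy model of one common dependence-based steering heuristic, under these assumptions: registers are already renamed (each name has one producer), queues never drain because issue is not modeled, and the names `Instr` and `steer` are hypothetical. An instruction follows its producer into an in-order queue when that producer sits at the queue's tail; otherwise it starts a new chain in an empty queue.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    name: str
    dest: Optional[str]          # renamed destination register, if any
    srcs: Tuple[str, ...] = ()   # renamed source registers


def steer(program, num_queues):
    """Steer instructions, in program order, into in-order queues (toy model)."""
    queues = [deque() for _ in range(num_queues)]
    queue_of = {}  # register -> index of the queue holding its producer

    for ins in program:
        target = None
        # Prefer a queue whose tail produces one of our source operands,
        # so a dependence chain stays in a single FIFO.
        for reg in ins.srcs:
            q = queue_of.get(reg)
            if q is not None and queues[q][-1].dest == reg:
                target = q
                break
        # Otherwise start a new chain in the first empty queue.
        if target is None:
            target = next((i for i, q in enumerate(queues) if not q), None)
        if target is None:
            # Assumption: a real core would stall dispatch here instead.
            raise RuntimeError("no suitable queue; dispatch would stall")
        queues[target].append(ins)
        if ins.dest is not None:
            queue_of[ins.dest] = target

    return [[i.name for i in q] for q in queues]


# Two load-headed chains that join at the final instruction.
program = [
    Instr("i0", "r1"),                # load
    Instr("i1", "r2", ("r1",)),       # depends on i0
    Instr("i2", "r3"),                # independent load
    Instr("i3", "r4", ("r3",)),       # depends on i2
    Instr("i4", "r5", ("r2", "r4")),  # joins both chains
]
result = steer(program, 3)
print(result)
```

In this model only queue heads would need a readiness check, which is the source of the energy saving over a fully associative wakeup. The stall case hints at the bottleneck the abstract describes: many short, load-headed chains exhaust the available queues.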
Pages: 144-161
Number of pages: 18
Related papers
  • [31] Out-of-order Execution of Database Queries
    Goda, Kazuo
    Hayamizu, Yuto
    Yamada, Hiroyuki
    Kitsuregawa, Masaru
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 13 (12) : 3489 - 3501
  • [32] Disjoint Out-of-Order Execution Processor
    Sharafeddine, Mageda
    Jothi, Komal
    Akkary, Haitham
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2012, 9 (03)
  • [33] Power-aware out-of-order issue logic in high-performance microprocessors
    Weinraub, Yehuda Sadeh
    Weiss, Shlomo
    MICROPROCESSORS AND MICROSYSTEMS, 2006, 30 (07) : 457 - 467
  • [34] Results on Out-of-Order Event Processing
    Fodor, Paul
    Anicic, Darko
    Rudolph, Sebastian
    PRACTICAL ASPECTS OF DECLARATIVE LANGUAGES, 2011, 6539 : 220 - +
  • [35] Regional Out-of-Order Writes in Total Store Order
    Singh, Sawan
    Jimborean, Alexandra
    Ros, Alberto
    PACT '20: PROCEEDINGS OF THE ACM INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, 2020 : 205 - 216
  • [36] Modeling out-of-order processors for WCET analysis
    Li, Xianfeng
    Roychoudhury, Abhik
    Mitra, Tulika
    REAL-TIME SYSTEMS, 2006, 34 (03) : 195 - 227
  • [37] Optimization Techniques for Verification of Out-of-Order Execution Machines
    Srinivasan, Sudarshan K.
    JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING, 2010, 2010
  • [38] Addressing Out-of-order Issue of Congestion-aware Adaptive Routing in Subnet based NoC
    Das, Tuhin Subhra
    Ghosal, Prasun
    Nath, Arnab
    PROCEEDINGS OF THE 2019 IEEE REGION 10 CONFERENCE (TENCON 2019): TECHNOLOGY, KNOWLEDGE, AND SOCIETY, 2019 : 1584 - 1589
  • [39] Fast precise interrupt handling without associative searching in multiple out-of-order issue processors
    Nam, SJ
    Park, IC
    Kyung, CM
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1999, E82D (03) : 645 - 653
  • [40] Minimum register instruction sequencing to reduce register spills in out-of-order issue superscalar architectures
    Govindarajan, R
    Yang, HB
    Amaral, JN
    Zhang, CH
    Gao, GR
    IEEE TRANSACTIONS ON COMPUTERS, 2003, 52 (01) : 4 - 20