DLL-conscious instruction fetch optimization for SMT processors

Cited by: 0
Authors
Mohamood, Fayez [1]
Ghosh, Mrinmoy [1]
Lee, Hsien-Hsin S. [1]
Affiliations
[1] Georgia Tech Elect & Comp Engn, Sch ECE Georgia Tech, Atlanta, GA 30332 USA
Keywords
Simultaneous multithreading; Dynamic linked libraries; Translation lookaside buffer; Caches
DOI
10.1016/j.sysarc.2008.04.014
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Simultaneous multithreading (SMT) processors can issue multiple instructions from distinct processes or threads in the same cycle. This technique increases overall throughput by keeping the pipeline resources more occupied, at the potential expense of reduced single-thread performance due to resource sharing. In the software domain, an increasing number of dynamically linked libraries (DLLs) are used by applications and operating systems, providing better flexibility and modularity and enabling code sharing. It is observed that a significant amount of execution time in software today is spent executing standard DLL instructions that are shared among multiple threads or processes. However, for an SMT processor with a virtually-indexed cache implementation, existing instruction fetching mechanisms can induce unnecessary false I-TLB and I-Cache misses caused by the DLL-based instructions that are intended to be shared. This problem is more prominent when multiple independent threads are executing concurrently on an SMT processor. In this work, we investigate a neglected form of contention between running threads in the I-TLB and I-Cache (including both VIVT and VIPT) due to DLLs. To address these shortcomings, we propose a system-level technique involving a lightweight modification in the microarchitecture and the OS. By exploiting the nature of the DLLs in our optimized system, we can reinstate the intended sharing of the DLLs in an SMT machine. Using Microsoft Windows based applications, our simulation results show that the optimized instruction fetching mechanism can reduce the number of DLL misses by up to 5.5 times and improve the instruction cache hit rates by up to 62%, resulting in up to 30% DLL IPC improvements and up to 15% overall IPC improvements. (c) 2008 Elsevier B.V. All rights reserved.
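The false-miss effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's mechanism or simulator; it is a minimal illustration under stated assumptions: a fully associative LRU I-TLB, DLL code mapped at the same virtual page in every thread, and a hypothetical `share_dll_pages` switch that drops the thread identifier from the tag for DLL pages. The names `ToyITLB` and `run` are invented for this example.

```python
from collections import OrderedDict


class ToyITLB:
    """Toy fully associative LRU I-TLB (hypothetical model, not from the paper)."""

    def __init__(self, entries, share_dll_pages):
        self.entries = entries
        self.share_dll_pages = share_dll_pages
        self.table = OrderedDict()  # tag -> True, ordered from LRU to MRU
        self.misses = 0

    def access(self, thread_id, vpage, is_dll):
        # Assumption: a DLL page has the same virtual page number in every thread.
        # With sharing enabled, DLL pages are tagged without the thread id,
        # so one entry can serve all threads.
        if is_dll and self.share_dll_pages:
            tag = ("dll", vpage)
        else:
            tag = (thread_id, vpage)

        if tag in self.table:
            self.table.move_to_end(tag)  # refresh LRU position on a hit
            return True

        self.misses += 1
        self.table[tag] = True
        if len(self.table) > self.entries:
            self.table.popitem(last=False)  # evict the least recently used entry
        return False


def run(share_dll_pages, threads=4, dll_pages=8, rounds=10, entries=64):
    """Every thread repeatedly fetches from the same set of DLL pages."""
    tlb = ToyITLB(entries, share_dll_pages)
    for _ in range(rounds):
        for t in range(threads):
            for p in range(dll_pages):
                tlb.access(t, p, is_dll=True)
    return tlb.misses


if __name__ == "__main__":
    print("per-thread tags:", run(share_dll_pages=False), "misses")
    print("shared DLL tags:", run(share_dll_pages=True), "misses")
```

With per-thread tags, each thread takes its own compulsory miss on every DLL page even though the translation is identical; with shared tags, only the first thread to touch a page misses. The model ignores capacity pressure from non-DLL code, the I-Cache, and any OS support, all of which the paper's evaluation would account for.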
Pages: 1089-1100
Number of pages: 12
Related papers
41 in total
  • [21] A Fetch policy maximizing throughput and fairness for two-context SMT processors
    Sun, CX
    Tang, HW
    Zhang, MX
    ADVANCED PARALLEL PROCESSING TECHNOLOGIES, PROCEEDINGS, 2005, 3756 : 13 - 22
  • [22] An exploration of instruction fetch requirement in out-of-order superscalar processors
    Michaud, P
    Seznec, A
    Jourdan, S
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2001, 29 (01) : 35 - 58
  • [23] Energy-effective instruction fetch unit for wide issue processors
    Aragón, JL
    Veidenbaum, AV
    ADVANCES IN COMPUTER SYSTEMS ARCHITECTURE, PROCEEDINGS, 2005, 3740 : 15 - 27
  • [24] An Exploration of Instruction Fetch Requirement in Out-of-Order Superscalar Processors
    Pierre Michaud
    André Seznec
    Stéphan Jourdan
    International Journal of Parallel Programming, 2001, 29 : 35 - 58
  • [25] A Dynamic Resource Allocation Optimization for SMT Processors
    Chen, Hongzhou
    Ping, Lingdi
    Lu, Kuijun
    Jiang, Xiaoning
    INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATIONS, PROCEEDINGS, 2009, : 353 - +
  • [26] A fault-tolerant dynamic fetch policy for SMT processors in multi-bus environments
    Fechner, Bernhard
    PAR ELEC 2006: INTERNATIONAL SYMPOSIUM ON PARALLEL COMPUTING IN ELECTRICAL ENGINEERING, PROCEEDINGS, 2006, : 31 - 36
  • [27] Exploring instruction-fetch bandwidth requirement in wide-issue superscalar processors
    Michaud, Pierre
    Seznec, Andre
    Jourdan, Stephan
    Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT, 1999, : 2 - 10
  • [28] Adaptive instruction dispatching techniques for Simultaneous Multi-Threading (SMT) processors
    Debnath, Monobrata
    Lin, Wei-Ming
    John, Eugene
    COMPUTERS & ELECTRICAL ENGINEERING, 2012, 38 (06) : 1616 - 1626
  • [29] An operation rearrangement technique for power optimization in VLIW instruction fetch
    Shin, D
    Kim, J
    Chang, N
    DESIGN, AUTOMATION AND TEST IN EUROPE, CONFERENCE AND EXHIBITION 2001, PROCEEDINGS, 2001, : 809 - 809
  • [30] Shrinking L1 Instruction Caches to Improve Energy-Delay in SMT Embedded Processors
    Ferreron-Labari, Alexandra
    Ortin-Obon, Marta
    Suarez-Gracia, Dario
    Alastruey-Benede, Jesus
    Vinals-Yufera, Victor
    ARCHITECTURE OF COMPUTING SYSTEMS - ARCS 2013, 2013, 7767 : 256 - 267