DLL-conscious instruction fetch optimization for SMT processors

Authors:

Mohamood, Fayez [1]

Ghosh, Mrinmoy [1]

Lee, Hsien-Hsin S. [1]

Affiliation:

[1] Georgia Tech Elect & Comp Engn, Sch ECE Georgia Tech, Atlanta, GA 30332 USA

Source:

JOURNAL OF SYSTEMS ARCHITECTURE | 2008 / Vol. 54 / No. 12

Keywords:

Simultaneous multithreading; Dynamic linked libraries; Translation lookaside buffer; Caches;

DOI:

10.1016/j.sysarc.2008.04.014

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Simultaneous multithreading (SMT) processors can issue multiple instructions from distinct processes or threads in the same cycle. This technique effectively increases the overall throughput by keeping the pipeline resources more occupied, at the potential expense of reducing single-thread performance due to resource sharing. In the software domain, an increasing number of dynamically linked libraries (DLLs) are used by applications and operating systems, providing better flexibility and modularity, and enabling code sharing. It is observed that a significant amount of execution time in software today is spent executing standard DLL instructions that are shared among multiple threads or processes. However, for an SMT processor with a virtually-indexed cache implementation, existing instruction fetching mechanisms can induce unnecessary false I-TLB and I-Cache misses caused by the DLL-based instructions that are intended to be shared. This problem is more prominent when multiple independent threads are executing concurrently on an SMT processor. In this work, we investigate a neglected form of contention between running threads in the I-TLB and I-Cache (including both VIVT and VIPT) due to DLLs. To address these shortcomings, we propose a system-level technique involving a light-weight modification in the microarchitecture and the OS. By exploiting the nature of the DLLs in our optimized system, we can reinstate the intended sharing of the DLLs in an SMT machine. Using Microsoft Windows-based applications, our simulation results show that the optimized instruction fetching mechanism can reduce the number of DLL misses by up to 5.5 times and improve the instruction cache hit rates by up to 62%, resulting in up to 30% DLL IPC improvements and up to 15% overall IPC improvements. (c) 2008 Elsevier B.V. All rights reserved.
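
The false-miss effect described in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator or mechanism: it assumes a small fully associative LRU instruction cache whose lines are tagged with a thread identifier plus a virtual line number, and it assumes both threads map the DLL at the same virtual addresses. The names `TaggedCache`, `share_dll`, and `run` are hypothetical and exist only for this illustration.

```python
from collections import OrderedDict


class TaggedCache:
    """Toy fully associative LRU cache keyed by (tag owner, virtual line)."""

    def __init__(self, capacity, share_dll):
        self.capacity = capacity
        self.share_dll = share_dll  # hypothetical switch: one common tag for DLL lines
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, thread_id, vline, is_dll):
        # Assumption: with sharing enabled, DLL lines are tagged with a common
        # owner instead of the fetching thread, so any thread can hit on them.
        owner = "shared" if (is_dll and self.share_dll) else thread_id
        key = (owner, vline)
        if key in self.lines:
            self.lines.move_to_end(key)  # refresh LRU position on a hit
            return True
        self.misses += 1
        self.lines[key] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        return False


def run(share_dll, rounds=4, dll_lines=8, capacity=16):
    """Two threads repeatedly fetch the same DLL lines; return total misses."""
    cache = TaggedCache(capacity, share_dll)
    for _ in range(rounds):
        for thread_id in (0, 1):
            for vline in range(dll_lines):
                cache.access(thread_id, vline, is_dll=True)
    return cache.misses


if __name__ == "__main__":
    print("per-thread tags:", run(share_dll=False))
    print("shared DLL tags:", run(share_dll=True))
```

In this toy setup the per-thread tagging takes 16 misses, because each thread brings in its own copy of the same eight DLL lines, while the shared tagging takes 8. The figures are properties of the toy parameters only and say nothing about the results reported in the paper.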

Pages: 1089-1100

Page count: 12
