CSMT: Simultaneous Multithreading for Clustered VLIW Processors

Cited by: 3
Authors
Gupta, Manoj [1]
Sanchez, Fermin [1]
Llosa, Josep [1]
Affiliations
[1] Univ Politecn Cataluna, Dept Arquitectura Computadors, ES-08034 Barcelona, Spain
Keywords
ILP; VLIW architectures; clustered VLIW architectures; multithreaded processors; simultaneous multithreading; ARCHITECTURE; PERFORMANCE;
DOI
10.1109/TC.2009.96
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Simultaneous MultiThreading (SMT) is a well-known technique that improves resource utilization by exploiting thread-level parallelism at the instruction grain level. However, implementing SMT for VLIWs requires complex structures, which is contrary to the VLIW philosophy of hardware simplicity. In this paper, we propose Cluster-level Simultaneous MultiThreading (CSMT) to allow some degree of SMT in clustered VLIW processors with low hardware cost and complexity. CSMT considers the set of operations that execute simultaneously in a given cluster as the assignment unit. To minimize cluster conflicts between threads, a very simple hardware-based cluster renaming mechanism is proposed. The hardware required to implement CSMT is cheap, realistic, and practical for a clustered VLIW processor. An analysis of the hardware required to implement CSMT shows that it is quite scalable, with up to eight threads easily supported at low hardware cost. The experimental results show that CSMT significantly improves performance when compared with other multithreading approaches suited for VLIW. For instance, with four threads, CSMT shows an average speedup of 110 percent over a single-thread VLIW architecture and 40 percent over Interleaved MultiThreading (IMT). In some cases, speedup can be as high as 225 percent over single-thread architecture and 84 percent over IMT.
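The cluster-level assignment and cluster renaming described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the mechanism, so everything here is an assumption made for illustration: the names `rename` and `merge_cycle` are hypothetical, renaming is modelled as a fixed per-thread rotation of cluster indices, threads are prioritized by index, and a thread issues only if every cluster it needs is free.

```python
from typing import List, Optional

# One thread's VLIW instruction, seen at cluster granularity: entry i is True
# when the thread has at least one operation for logical cluster i this cycle.
Bundle = List[bool]


def rename(bundle: Bundle, shift: int) -> Bundle:
    """Map logical cluster i to physical cluster (i + shift) % n.

    The rotation is an assumed renaming scheme, not taken from the paper.
    """
    n = len(bundle)
    physical = [False] * n
    for logical, used in enumerate(bundle):
        if used:
            physical[(logical + shift) % n] = True
    return physical


def merge_cycle(bundles: List[Bundle], shifts: List[int]) -> List[int]:
    """Return the ids of the threads that issue in this cycle.

    Threads are considered in index order (assumed fixed priority). A thread
    issues only if none of its physical clusters is already owned, so the set
    of operations per cluster is the unit of assignment.
    """
    n = len(bundles[0])
    owner: List[Optional[int]] = [None] * n  # which thread holds each cluster
    issued: List[int] = []
    for tid, (bundle, shift) in enumerate(zip(bundles, shifts)):
        physical = rename(bundle, shift)
        needed = [c for c in range(n) if physical[c]]
        if all(owner[c] is None for c in needed):
            for c in needed:
                owner[c] = tid
            issued.append(tid)
    return issued


# Two threads that both use logical clusters 0 and 1 on a 4-cluster machine.
same = [[True, True, False, False], [True, True, False, False]]
no_renaming = merge_cycle(same, shifts=[0, 0])    # thread 1 conflicts
with_renaming = merge_cycle(same, shifts=[0, 2])  # thread 1 moves to clusters 2, 3
```

In this toy case, the rotation lets both threads issue in the same cycle, whereas identical mappings force the second thread to wait. It only illustrates why renaming can reduce cluster conflicts when compilers favor the same low-numbered clusters; it says nothing about the hardware cost or the speedups reported in the abstract.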
Pages: 385-399
Page count: 15
Related papers
50 in total
  • [1] Cluster-Level Simultaneous Multithreading for VLIW Processors
    Gupta, Manoj
    Sanchez, Fermin
    Llosa, Josep
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, VOLS 1 AND 2, 2007, : 121 - 128
  • [2] Clustered microarchitecture simultaneous multithreading
    Lee, SW
    Gaudiot, JL
    [J]. EURO-PAR 2003 PARALLEL PROCESSING, PROCEEDINGS, 2003, 2790 : 576 - 585
  • [3] Simultaneous multithreading trace processors
    Wang, KF
    Ji, ZZ
    Hu, MZ
    [J]. ADVANCED PARALLEL PROCESSING TECHNOLOGIES, PROCEEDINGS, 2003, 2834 : 96 - 103
  • [4] Merge logic for clustered multithreaded VLIW processors
    Gupta, Manoj
    Sanchez, Fermin
    Llosa, Josep
    [J]. DSD 2007: 10TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN ARCHITECTURES, METHODS AND TOOLS, PROCEEDINGS, 2007, : 353 - 360
  • [5] Evaluation of speed and area of clustered VLIW processors
    Terechko, A
    Garg, M
    Corporaal, H
    [J]. 18TH INTERNATIONAL CONFERENCE ON VLSI DESIGN, PROCEEDINGS: POWER AWARE DESIGN OF VLSI SYSTEMS, 2005, : 557 - 563
  • [6] On the Power Management of Simultaneous Multithreading Processors
    Youssef, Ahmed
    Zahran, Mohamed
    Anis, Mohab
    Elmasry, Mohamed
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2010, 18 (08) : 1243 - 1248
  • [7] Network Applications on Simultaneous Multithreading Processors
    Yi, Kyueun
    Gaudiot, Jean-Luc
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2010, 59 (09) : 1200 - 1209
  • [8] Distributed data cache designs for clustered VLIW processors
    Gibert, E
    Sánchez, J
    González, A
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2005, 54 (10) : 1227 - 1241
  • [9] An Efficient Heuristic for Instruction Scheduling on Clustered VLIW Processors
    Zhang, Xuemeng
    Wu, Hui
    Xue, Jingling
    [J]. PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON COMPILERS, ARCHITECTURES AND SYNTHESIS FOR EMBEDDED SYSTEMS (CASES '11), 2011, : 35 - 44
  • [10] Simultaneous multithreading trace processors: Improving trace processors performance
    Wang, KF
    Ji, ZZ
    Hu, MZ
    [J]. MICROPROCESSORS AND MICROSYSTEMS, 2006, 30 (02) : 102 - 116