Architectural support for scalable speculative parallelization in shared-memory multiprocessors

Cited by: 0
Authors
Cintra, M [1]
Martínez, JF [1]
Torrellas, J [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Speculative parallelization aggressively executes in parallel codes that cannot be fully parallelized by the compiler. Past proposals of hardware schemes have mostly focused on single-chip multiprocessors (CMPs), whose effectiveness is necessarily limited by their small size. Very few schemes have attempted this technique in the context of scalable shared-memory systems. In this paper, we present and evaluate a new hardware scheme for scalable speculative parallelization. This design needs relatively simple hardware and is efficiently integrated into a cache-coherent NUMA system. We have designed the scheme in a hierarchical manner that largely abstracts away the internals of the node. We effectively utilize a speculative CMP as the building block for our scheme. Simulations show that the architecture proposed delivers good speedups at a modest hardware cost. For a set of important non-analyzable scientific loops, we report average speedups of 4.2 for 16 processors. We show that support for per-word speculative state is required by our applications, or else the performance suffers greatly.
Pages: 13-24
Page count: 12
Related papers
50 in total
  • [1] Parallelization of benchmarks for scalable shared-memory multiprocessors
    Paek, Y
    Navarro, A
    Zapata, E
    Padua, D
    [J]. 1998 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PROCEEDINGS, 1998, : 401 - 408
  • [2] Architectural support for parallel reductions in scalable shared-memory multiprocessors
    Garzarán, MJ
    Prvulovic, M
    Zhang, Y
    Jula, A
    Yu, H
    Rauchwerger, L
    Torrellas, J
    [J]. 2001 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES, PROCEEDINGS, 2001, : 243 - 254
  • [3] Hardware for speculative run-time parallelization in distributed shared-memory multiprocessors
    Zhang, Y
    Rauchwerger, L
    Torrellas, J
    [J]. 1998 FOURTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 1998, : 162 - 173
  • [4] Architectural trends for shared-memory multiprocessors
    Stenstrom, P
    [J]. THIRTIETH HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, VOL 1: SOFTWARE TECHNOLOGY AND ARCHITECTURE, 1997, : 732 - 733
  • [5] SCALABLE CACHE COHERENCE FOR SHARED-MEMORY MULTIPROCESSORS
    THAPAR, M
    DELAGI, BA
    FLYNN, MJ
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 1992, 591 : 1 - 12
  • [6] ALGORITHMS FOR SCALABLE SYNCHRONIZATION ON SHARED-MEMORY MULTIPROCESSORS
    MELLORCRUMMEY, JM
    SCOTT, ML
    [J]. ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1991, 9 (01) : 21 - 65
  • [7] Data forwarding in scalable shared-memory multiprocessors
    Koufaty, DA
    Chen, XF
    Poulsen, DK
    Torrellas, J
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1996, 7 (12) : 1250 - 1264
  • [8] Compiler directed parallelization of loops in scale for shared-memory multiprocessors
    Johnson, GS
    Sethumadhavan, S
    [J]. COMPUTATIONAL SCIENCE - ICCS 2003, PT III, PROCEEDINGS, 2003, 2659 : 946 - 955
  • [9] Fast synchronization on shared-memory multiprocessors: An architectural approach
    Fang, Z
    Zhang, LX
    Carter, JB
    Cheng, LQ
    Parker, M
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2005, 65 (10) : 1158 - 1170
  • [10] COOPERATIVE SHARED-MEMORY - SOFTWARE AND HARDWARE FOR SCALABLE MULTIPROCESSORS
    HILL, MD
    LARUS, JR
    REINHARDT, SK
    WOOD, DA
    [J]. ACM TRANSACTIONS ON COMPUTER SYSTEMS, 1993, 11 (04) : 300 - 318