SYNCHRONIZATION AND COMMUNICATION COSTS OF LOOP PARTITIONING ON SHARED-MEMORY MULTIPROCESSOR SYSTEMS

Cited by: 1
Author
GUPTA, R
Affiliation
[1] Department of Computer Science, University of Pittsburgh, Pittsburgh, PA
Keywords
COMMUNICATION; PARALLELIZING COMPILERS; PROGRAM DECOMPOSITION; RUN-TIME SCHEDULING; SHARED MEMORY MULTIPROCESSOR SYSTEMS; SYNCHRONIZATION
DOI
10.1109/71.149968
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To exploit loop level parallelism on shared memory multiprocessor systems, loops are decomposed and their execution scheduled on different processors in parallel. This paper presents strategies for static loop decomposition and scheduling as well as compiler assisted run-time scheduling that take into account, in addition to the cost of performing operations, the overhead costs associated with a decomposition and schedule. An algorithm for static decomposition of multidimensional loops based upon the operation execution costs, communication costs, and synchronization costs is discussed. Following the decomposition of a program, synchronization instructions are introduced to ensure correct program execution. An algorithm for determining the explicit synchronization instructions that should be introduced in a program to ensure correct execution of the program with arbitrarily nested loops is presented. Techniques for reducing run-time scheduling, communication and synchronization costs due to self scheduling, a compiler assisted run-time scheduling technique, of multidimensional loops are also presented. Experiments performed on the Encore multiprocessor system demonstrate that the techniques developed can reduce overhead costs.
Pages: 505-512
Page count: 8