Enabling Fine-Grained OpenMP Tasking on Tightly-Coupled Shared Memory Clusters

Authors
Burgio, Paolo [1]
Tagliavini, Giuseppe [1]
Marongiu, Andrea [1]
Benini, Luca [1]
Affiliations
[1] Univ Bologna, DEIS, Viale Risorgimento 2, I-40136 Bologna, Italy
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cluster-based architectures are increasingly being adopted to design embedded many-cores. These platforms can deliver very high peak performance within a contained power envelope, provided that programmers can make effective use of the available parallel cores. This is becoming an extremely difficult task, as embedded applications are growing in complexity and exhibit irregular and dynamic parallelism. The OpenMP tasking extensions represent a powerful abstraction to capture this form of parallelism. However, efficiently supporting them on cluster-based embedded SoCs is not easy, because the fine-grained parallel workload present in embedded applications cannot tolerate high memory and run-time overheads. In this paper we present our design of the runtime support layer for OpenMP tasking on an embedded shared memory cluster, identifying key aspects for achieving performance and discussing important architectural support for removing major bottlenecks.
Pages: 1504-1509
Page count: 6
Related papers
50 in total
  • [1] Variation-tolerant OpenMP Tasking on Tightly-coupled Processor Clusters
    Rahimi, Abbas
    Marongiu, Andrea
    Burgio, Paolo
    Gupta, Rajesh K.
    Benini, Luca
    DESIGN, AUTOMATION & TEST IN EUROPE, 2013, : 541 - 546
  • [2] Tightly-Coupled Hardware Support to Dynamic Parallelism Acceleration in Embedded Shared Memory Clusters
    Burgio, Paolo
    Tagliavini, Giuseppe
    Conti, Francesco
    Marongiu, Andrea
    Benini, Luca
    2014 DESIGN, AUTOMATION AND TEST IN EUROPE CONFERENCE AND EXHIBITION (DATE), 2014,
  • [3] A tightly-coupled Hardware Controller to improve scalability and programmability of shared-memory heterogeneous clusters
    Burgio, Paolo
    Danilo, Robin
    Marongiu, Andrea
    Coussy, Philippe
    Benini, Luca
    2014 DESIGN, AUTOMATION AND TEST IN EUROPE CONFERENCE AND EXHIBITION (DATE), 2014,
  • [4] Parallel scheduler for a shared memory (tightly-coupled) multiprocessor system
    Sharma, G
    Gupta, B
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 1998, 13 (04): : 241 - 247
  • [5] Architecture Support for Tightly-Coupled Multi-Core Clusters with Shared-Memory HW Accelerators
    Dehyadegari, Masoud
    Marongiu, Andrea
    Kakoee, Mohammad Reza
    Mohammadi, Siamak
    Yazdani, Naser
    Benini, Luca
    IEEE TRANSACTIONS ON COMPUTERS, 2015, 64 (08) : 2132 - 2144
  • [6] Unleashing Fine-Grained Parallelism on Embedded Many-Core Accelerators with Lightweight OpenMP Tasking
    Tagliavini, Giuseppe
    Cesarini, Daniele
    Marongiu, Andrea
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2018, 29 (09) : 2150 - 2163
  • [7] PERFORMANCE OF TIGHTLY-COUPLED SYSTEMS WITH SHARED CACHE
    CHAUDHRY, GM
    BEDI, JS
    MICROPROCESSING AND MICROPROGRAMMING, 1991, 29 (05): : 287 - 292
  • [8] Fine-Grained QoS Control via Tightly-Coupled Bandwidth Monitoring and Regulation for FPGA-Based Heterogeneous SoCs
    Valente, Giacomo
    Brilli, Gianluca
    Di Mascio, Tania
    Capotondi, Alessandro
    Burgio, Paolo
    Valente, Paolo
    Marongiu, Andrea
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2025, 36 (02) : 326 - 340
  • [9] Fine-Grained QoS Control via Tightly-Coupled Bandwidth Monitoring and Regulation for FPGA-based Heterogeneous SoCs
    Brilli, G.
    Valente, G.
    Capotondi, A.
    Burgio, P.
    Di Mascio, T.
    Valente, P.
    Marongiu, A.
    2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC, 2023,
  • [10] HeteroSync: A Benchmark Suite for Fine-Grained Synchronization on Tightly Coupled GPUs
    Sinclair, Matthew D.
    Alsop, Johnathan
    Adve, Sarita V.
    PROCEEDINGS OF THE 2017 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC), 2017, : 239 - 249