Enabling Fine-Grained OpenMP Tasking on Tightly-Coupled Shared Memory Clusters

Cited by: 0
Authors
Burgio, Paolo [1]
Tagliavini, Giuseppe [1]
Marongiu, Andrea [1]
Benini, Luca [1]
Affiliations
[1] Univ Bologna, DEIS, Viale Risorgimento 2, I-40136 Bologna, Italy
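Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract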
Cluster-based architectures are increasingly being adopted to design embedded many-cores. These platforms can deliver very high peak performance within a contained power envelope, provided that programmers can make effective use of the available parallel cores. This is becoming an extremely difficult task, as embedded applications are growing in complexity and exhibit irregular and dynamic parallelism. The OpenMP tasking extensions represent a powerful abstraction to capture this form of parallelism. However, efficiently supporting them on cluster-based embedded SoCs is not easy, because the fine-grained parallel workload present in embedded applications cannot tolerate high memory and runtime overheads. In this paper, we present our design of the runtime support layer for OpenMP tasking on an embedded shared memory cluster, identifying key aspects of achieving performance and discussing important architectural support for removing major bottlenecks.
Pages: 1504-1509
Page count: 6