Task-Parallel Programming on NUMA Architectures

Authors
Terboven, Christian [1]
Schmidl, Dirk [1]
Cramer, Tim [1]
an Mey, Dieter [1]
Affiliations
[1] Rhein Westfal TH Aachen, JARA, Aachen, Germany
Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
The multicore era has led to a renaissance of shared memory parallel programming models. Moreover, the introduction of task-level parallelization raises the level of abstraction compared to thread-centric expression of parallelism. However, tasks might exhibit poor performance on NUMA systems if locality cannot be controlled and non-local data is accessed. This work investigates various approaches to express task-parallelism using the OpenMP tasking model, from a programmer's point of view. We describe and compare task creation strategies and devise methods to preserve locality on NUMA architectures while optimizing the degree of parallelism. Our proposals are evaluated on reasonably large NUMA systems with both important application kernels and real-world simulation codes.
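To make the terms in the abstract concrete, the following is a minimal sketch in C with OpenMP pragmas. It is not taken from the paper. It assumes an operating system with a first-touch page placement policy, and the helper names alloc_first_touch and sum_with_tasks are hypothetical. The first function initializes an array in parallel so that pages are likely to be spread across NUMA nodes; the second shows one possible task creation strategy, in which a single thread creates one task per chunk. The OpenMP runtime decides which thread executes each task, so this sketch does not by itself guarantee that a task runs close to its data, which is the kind of locality problem the abstract refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and initialize them in parallel.
 * Assumption: with a first-touch policy, each page is placed on the NUMA
 * node of the thread that writes it first, so a static schedule spreads
 * the array across nodes. */
static double *alloc_first_touch(size_t n, double value)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < n; ++i)
        a[i] = value;
    return a;
}

/* Hypothetical helper: sum the array with one task per chunk.
 * A single thread creates all tasks (single-producer pattern); any
 * thread of the team may execute them. */
static double sum_with_tasks(const double *a, size_t n, size_t chunk)
{
    assert(chunk > 0);
    size_t nchunks = (n + chunk - 1) / chunk;
    double total = 0.0;

    double *partial = calloc(nchunks > 0 ? nchunks : 1, sizeof *partial);
    if (partial == NULL) {
        /* Fall back to a serial sum if the scratch buffer is unavailable. */
        for (size_t i = 0; i < n; ++i)
            total += a[i];
        return total;
    }

    #pragma omp parallel
    #pragma omp single
    {
        for (size_t c = 0; c < nchunks; ++c) {
            #pragma omp task firstprivate(c) shared(a, partial)
            {
                size_t lo = c * chunk;
                size_t hi = (lo + chunk < n) ? lo + chunk : n;
                double s = 0.0;
                for (size_t i = lo; i < hi; ++i)
                    s += a[i];
                partial[c] = s;  /* each task writes its own slot */
            }
        }
    }   /* the implicit barrier here guarantees all tasks have finished */

    for (size_t c = 0; c < nchunks; ++c)
        total += partial[c];
    free(partial);
    return total;
}
```

Compiled without OpenMP support, the pragmas are ignored and both functions run serially with the same result. The chunk size controls the trade-off between the number of tasks (degree of parallelism) and per-task overhead.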
Pages: 638-649
Page count: 12