Task-Parallel Programming on NUMA Architectures

Cited by: 0
Authors
Terboven, Christian [1]
Schmidl, Dirk [1]
Cramer, Tim [1]
an Mey, Dieter [1]
Affiliations
[1] RWTH Aachen University, JARA, Aachen, Germany
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
The multicore era has led to a renaissance of shared memory parallel programming models. Moreover, the introduction of task-level parallelization raises the level of abstraction compared to thread-centric expression of parallelism. However, tasks might exhibit poor performance on NUMA systems if locality cannot be controlled and non-local data is accessed. This work investigates various approaches to express task-parallelism using the OpenMP tasking model, from a programmer's point of view. We describe and compare task creation strategies and devise methods to preserve locality on NUMA architectures while optimizing the degree of parallelism. Our proposals are evaluated on reasonably large NUMA systems with both important application kernels and real-world simulation codes.
Pages: 638-649
Page count: 12
Related papers
  • [1] Task-Parallel Programming with Constrained Parallelism
    Huang, Tsung-Wei
    Hwang, Leslie
    [J]. 2022 IEEE HIGH PERFORMANCE EXTREME COMPUTING VIRTUAL CONFERENCE (HPEC), 2022
  • [2] An Efficient Task-Parallel Pipeline Programming Framework
    Chiu, Cheng-Hsiang
    Xiong, Zhicheng
    Guo, Zizheng
    Huang, Tsung-Wei
    Lin, Yibo
    [J]. THE PROCEEDINGS OF INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION, HPC ASIA 2024, 2024, : 95 - 106
  • [3] Scalable Task-Parallel SGD on Matrix Factorization in Multicore Architectures
    Nishioka, Yusuke
    Taura, Kenjiro
    [J]. 2015 IEEE 29TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, 2015, : 1178 - 1184
  • [4] Energy efficiency optimization of task-parallel codes on asymmetric architectures
    Costero, Luis
    Igual, Francisco D.
    Olcoz, Katzalin
    Tirado, Francisco
    [J]. 2017 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2017, : 402 - 409
  • [5] Performance analysis of four parallel programming models on NUMA architectures
    Mohamed, AS
    Cantonnet, F
    [J]. PARALLEL AND DISTRIBUTED COMPUTING SYSTEMS, PROCEEDINGS, 2003, : 119 - 125
  • [6] Wait-free Hyperobjects for Task-parallel Programming Systems
    Wimmer, Martin
    [J]. IEEE 27TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2013), 2013, : 803 - 812
  • [7] DeepSparse: A Task-parallel Framework for Sparse Solvers on Deep Memory Architectures
    Afibuzzaman, Md
    Rabbi, Fazlay
    Ozkaya, M. Yusuf
    Aktulga, Hasan Metin
    Catalyurek, Umit V.
    [J]. 2019 IEEE 26TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HIPC), 2019, : 373 - 382
  • [8] NUMA-aware Scheduling and Memory Allocation for data-flow task-parallel Applications
    Drebes, Andi
    Pop, Antoniu
    Heydemann, Karine
    Drach, Nathalie
    Cohen, Albert
    [J]. ACM SIGPLAN NOTICES, 2016, 51 (08) : 391 - 392
  • [9] An Evaluation of Task-Parallel Frameworks for Sparse Solvers on Multicore and Manycore CPU Architectures
    Alperen, Abdullah
    Afibuzzaman, Md
    Rabbi, Fazlay
    Ozkaya, M. Yusuf
    Catalyurek, Umit
    Aktulga, Hasan Metin
    [J]. 50TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2021
  • [10] Performance oriented programming for NUMA architectures
    Chapman, B
    Patil, A
    Prabhakar, A
    [J]. OPENMP SHARED MEMORY PARALLEL PROGRAMMING, PROCEEDINGS, 2001, 2104 : 137 - 154