MulticoreBSP for C: A High-Performance Library for Shared-Memory Parallel Programming

Authors
A. N. Yzelman
R. H. Bisseling
D. Roose
K. Meerbergen
Affiliations
[1] Flanders ExaScience Lab (Intel Labs Europe), Department of Computer Science
[2] KU Leuven, Department of Mathematics
[3] Utrecht University
Keywords
High-performance computing; Bulk synchronous parallel; Shared-memory parallel programming; Software library; Fast Fourier transform; Sparse matrix–vector multiplication
Abstract
The bulk synchronous parallel (BSP) model, as well as parallel programming interfaces based on BSP, classically target distributed-memory parallel architectures. In earlier work, Yzelman and Bisseling designed a MulticoreBSP for Java library specifically for shared-memory architectures. In the present article, we further investigate this concept and introduce the new high-performance MulticoreBSP for C library. Among other features, this library supports nested BSP runs. We show that existing BSP software performs well regardless of whether it runs on distributed-memory or shared-memory architectures, and that applications in MulticoreBSP can attain high performance. The paper details the implementation of the fast Fourier transform and the sparse matrix–vector multiplication in BSP, both of which outperform state-of-the-art implementations written in other shared-memory parallel programming interfaces. We furthermore study the applicability of BSP when working on highly non-uniform memory access architectures.
Pages: 619 - 642
Page count: 23
Related papers
50 in total
  • [1] MulticoreBSP for C: A High-Performance Library for Shared-Memory Parallel Programming
    Yzelman, A. N.
    Bisseling, R. H.
    Roose, D.
    Meerbergen, K.
    [J]. INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2014, 42 (04) : 619 - 642
  • [2] SHARED-MEMORY PARALLEL PROGRAMMING IN C++
    BECK, B
    [J]. IEEE SOFTWARE, 1990, 7 (04) : 38 - 48
  • [3] A Methodology Approach to Compare Performance of Parallel Programming Models for Shared-Memory Architectures
    Utrera, Gladys
    Gil, Marisa
    Martorell, Xavier
    [J]. NUMERICAL COMPUTATIONS: THEORY AND ALGORITHMS, PT I, 2020, 11973 : 318 - 325
  • [4] A high-performance MPI implementation on a shared-memory vector supercomputer
    Gropp, W
    Lusk, E
    [J]. PARALLEL COMPUTING, 1997, 22 (11) : 1513 - 1526
  • [5] HIGH-PERFORMANCE UNIVERSAL HASHING, WITH APPLICATIONS TO SHARED-MEMORY SIMULATIONS
    DIETZFELBINGER, M
    HEIDE, FMAD
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 1992, 594 : 250 - 269
  • [6] LIBMF: A Library for Parallel Matrix Factorization in Shared-memory Systems
    Chin, Wei-Sheng
    Yuan, Bo-Wen
    Yang, Meng-Yuan
    Zhuang, Yong
    Juan, Yu-Chin
    Lin, Chih-Jen
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [7] GEMs: shared-memory parallel programming for Node.js
    Bonetta D.
    Salucci L.
    Marr S.
    Binder W.
    [J]. ACM SIGPLAN Notices, 2016, 51 (10) : 531 - 547
  • [8] Performance evaluation of or-parallel logic programming systems on distributed shared-memory architectures
    Calegario, VM
    Dutra, ID
    [J]. EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 1484 - 1491
  • [9] The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures
    Loff, Junior
    Griebler, Dalvan
    Mencagli, Gabriele
    Araujo, Gabriell
    Torquati, Massimo
    Danelutto, Marco
    Fernandes, Luiz Gustavo
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 125 : 743 - 757
  • [10] A Comparative Study and Evaluation of Parallel Programming Models for Shared-Memory Parallel Architectures
    Luis Miguel Sanchez
    Javier Fernandez
    Rafael Sotomayor
    Soledad Escolar
    J. Daniel Garcia
    [J]. New Generation Computing, 2013, 31 : 139 - 161