Parallel prefix (scan) algorithms for MPI

Authors
Sanders, Peter
Traeff, Jesper Larsson
Affiliations
[1] Univ Karlsruhe, D-76131 Karlsruhe, Germany
[2] NEC Europe Ltd, C&C Res Labs, D-53757 St Augustin, Germany
Keywords
cluster of SMPs; collective communication; MPI implementation; prefix sum; pipelining
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We describe and experimentally compare four theoretically well-known algorithms for the parallel prefix operation (scan, in MPI terms), and give a presumably novel, doubly-pipelined implementation of the in-order binary tree parallel prefix algorithm. Bidirectional interconnects can benefit from this implementation. We present results from a 32-node AMD cluster with Myrinet 2000 and a 72-node SX-8 parallel vector system. The doubly-pipelined algorithm is more than a factor of two faster than the straightforward binomial-tree algorithm found in many MPI implementations. However, due to its small constant factors, the simple linear pipeline algorithm is preferable for systems with a moderate number of processors. We also discuss adapting the algorithms to clusters of SMP nodes.
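For readers unfamiliar with the operation named in the abstract, the following is a minimal sketch of what an inclusive scan computes: rank i ends up with x_0 op x_1 op ... op x_i for an associative operator op. It also simulates a generic textbook round-based scheme with doubling distances. This is only an illustration under stated assumptions: it is not claimed to be any of the four algorithms compared in the paper, it runs in a single process rather than over MPI, and the names `inclusive_scan` and `doubling_scan` are hypothetical.

```python
from operator import add


def inclusive_scan(values, op=add):
    """Sequential reference: element i is values[0] op ... op values[i]."""
    out = []
    for v in values:
        out.append(v if not out else op(out[-1], v))
    return out


def doubling_scan(values, op=add):
    """Simulate a round-based scan with one value per simulated rank.

    Assumption: this is a generic distance-doubling scheme, used here only
    to illustrate the operation; it is not taken from the paper.
    In the round with distance d, rank i combines the partial result held
    by rank i - d (if that rank exists) with its own partial result.
    """
    partial = list(values)
    p = len(partial)
    rounds = 0
    d = 1
    while d < p:
        # Snapshot so every simulated message carries the pre-round value.
        prev = list(partial)
        for i in range(d, p):
            # The lower rank's value is the left operand, so the result
            # stays correct for associative but non-commutative operators.
            partial[i] = op(prev[i - d], prev[i])
        d *= 2
        rounds += 1
    return partial, rounds


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9, 2, 6]
    result, rounds = doubling_scan(data)
    print(result, "in", rounds, "rounds")
```

The simulated scheme needs about log2(p) rounds for p ranks, whereas a simple chain that passes a partial result from rank to rank needs p - 1 steps. The abstract's point is that such step counts do not settle the question on their own, since constant factors and pipelining of large messages also matter.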
Pages: 49-57
Number of pages: 9