Parallel prefix (scan) algorithms for MPI

Authors
Sanders, Peter
Traeff, Jesper Larsson
Affiliations
[1] Univ Karlsruhe, D-76131 Karlsruhe, Germany
[2] NEC Europe Ltd, C&C Res Labs, D-53757 St Augustin, Germany
Keywords
cluster of SMPs; collective communication; MPI implementation; prefix sum; pipelining
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
We describe and experimentally compare four theoretically well-known algorithms for the parallel prefix operation (scan, in MPI terms), and give a presumably novel, doubly-pipelined implementation of the in-order binary tree parallel prefix algorithm. Bidirectional interconnects can benefit from this implementation. We present results from a 32-node AMD cluster with Myrinet 2000 and a 72-node SX-8 parallel vector system. The doubly-pipelined algorithm is more than a factor of two faster than the straightforward binomial-tree algorithm found in many MPI implementations. However, due to its small constant factors, the simple linear pipeline algorithm is preferable for systems with a moderate number of processors. We also discuss adapting the algorithms to clusters of SMP nodes.
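As an illustration of the operation the abstract refers to, the following is a minimal sketch, not the authors' implementation. It simulates an inclusive scan and a linear pipeline scan sequentially in plain Python, without MPI. The function names, the `block_size` parameter, and the round schedule are assumptions made for this example only.

```python
from itertools import accumulate
from operator import add


def reference_scan(values, op=add):
    """Inclusive prefix: result[i] = values[0] op ... op values[i]."""
    return list(accumulate(values, op))


def linear_pipeline_scan(vectors, block_size, op=add):
    """Sequentially simulate a linear pipeline scan over p ranks.

    vectors[r] is the input vector of (hypothetical) rank r. On return,
    result[r] holds the element-wise inclusive prefix over ranks 0..r.
    Each vector is cut into blocks; in round t, rank r combines block
    t - r, so in a real parallel run different ranks would be busy with
    different blocks at the same time.
    """
    p = len(vectors)
    m = len(vectors[0])
    starts = list(range(0, m, block_size))
    result = [list(v) for v in vectors]
    for t in range(len(starts) + p - 1):      # pipeline rounds
        for r in range(1, p):                 # rank 0 keeps its own input
            b = t - r                         # block handled by rank r now
            if 0 <= b < len(starts):
                lo = starts[b]
                hi = min(lo + block_size, m)
                for i in range(lo, hi):
                    # Predecessor's block b was finalized in round t - 1.
                    # Operand order is kept, so op need not be commutative.
                    result[r][i] = op(result[r - 1][i], result[r][i])
    return result


if __name__ == "__main__":
    data = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    print(linear_pipeline_scan(data, block_size=2))
```

With `m` elements per rank split into `ceil(m / block_size)` blocks, this simulated schedule takes `ceil(m / block_size) + p - 1` rounds, which is one way to see why pipelining can pay off for long vectors on a moderate number of ranks.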
Pages: 49-57
Page count: 9