Parallel prefix (scan) algorithms for MPI

Authors
Sanders, Peter
Traeff, Jesper Larsson
Affiliations
[1] Univ Karlsruhe, D-76131 Karlsruhe, Germany
[2] NEC Europe Ltd, C&C Res Labs, D-53757 St Augustin, Germany
Keywords
cluster of SMPs; collective communication; MPI implementation; prefix sum; pipelining
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We describe and experimentally compare four theoretically well-known algorithms for the parallel prefix operation (scan, in MPI terms), and give a presumably novel, doubly-pipelined implementation of the in-order binary tree parallel prefix algorithm. Bidirectional interconnects can benefit from this implementation. We present results from a 32-node AMD cluster with Myrinet 2000 and a 72-node SX-8 parallel vector system. The doubly-pipelined algorithm is more than a factor of two faster than the straightforward binomial-tree algorithm found in many MPI implementations. However, due to its small constant factors, the simple linear pipeline algorithm is preferable for systems with a moderate number of processors. We also discuss adapting the algorithms to clusters of SMP nodes.
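To illustrate the linear pipeline idea named in the abstract, the following is a minimal sequential simulation in Python, not the authors' implementation and not MPI code. It assumes each rank holds a vector of equal length, that vectors are cut into blocks of a hypothetical `block_size`, and that rank `r` combines block `b` in step `b + r` after receiving the partial prefix from rank `r - 1`. The function name `linear_pipeline_scan` and all parameters are illustrative assumptions.

```python
from operator import add


def linear_pipeline_scan(inputs, op=add, block_size=2):
    """Simulate an inclusive scan over len(inputs) ranks with a linear pipeline.

    inputs[r] is the local vector of rank r. The result for rank r is the
    element-wise combination inputs[0] op inputs[1] op ... op inputs[r].
    This is a sequential sketch under stated assumptions, not MPI code.
    """
    p = len(inputs)
    if p == 0:
        return []
    m = len(inputs[0])
    if any(len(v) != m for v in inputs):
        raise ValueError("all ranks must hold vectors of the same length")
    if block_size < 1:
        raise ValueError("block_size must be at least 1")

    # Rank 0's result is its own input; other ranks are filled in block by block.
    results = [list(v) for v in inputs]
    blocks = [(lo, min(lo + block_size, m)) for lo in range(0, m, block_size)]

    # In step t, rank r works on block t - r, so block b reaches rank r
    # exactly one step after rank r - 1 has finished it.
    for step in range(len(blocks) + p - 1):
        for rank in range(1, p):
            b = step - rank
            if 0 <= b < len(blocks):
                lo, hi = blocks[b]
                for i in range(lo, hi):
                    # Left operand is the prefix from the predecessor, so
                    # non-commutative (but associative) operators stay in order.
                    results[rank][i] = op(results[rank - 1][i], inputs[rank][i])
    return results


if __name__ == "__main__":
    print(linear_pipeline_scan([[1, 2, 3], [10, 20, 30], [100, 200, 300]]))
```

In this sketch the number of simulated steps is the number of blocks plus the number of ranks minus one, which reflects why a linear pipeline can be attractive for long vectors on a moderate number of processors: the per-rank startup delay is amortized over many blocks.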
Pages: 49-57
Page count: 9
Related papers
50 in total
  • [1] New Parallel Prefix Algorithms
    Lin, Yen-Chun
    Hung, Li-Ling
    [J]. AIC '09: PROCEEDINGS OF THE 9TH WSEAS INTERNATIONAL CONFERENCE ON APPLIED INFORMATICS AND COMMUNICATIONS: RECENT ADVANCES IN APPLIED INFORMAT AND COMMUNICATIONS, 2009, : 204 - +
  • [2] Efficient parallel prefix algorithms on multicomputers
    Lin, YC
    Lin, CM
    [J]. JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2000, 16 (01) : 41 - 64
  • [3] PARALLEL ALGORITHMS FOR COMPUTING LINKED LIST PREFIX
    HAN, Y
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1989, 6 (03) : 537 - 557
  • [4] A programming methodology for designing parallel prefix algorithms
    Fan, MH
    Huang, CH
    Chung, YC
    Liu, JS
    Lee, JZ
    [J]. PROCEEDINGS OF THE 2001 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2001, : 463 - 470
  • [5] Parallel algorithms for computing linked list prefix
    [J]. Han, Yijie, 1600, (06):
  • [6] Two families of parallel prefix algorithms for multicomputers
    Hung, Li-Ling
    Lin, Yen-Chun
    [J]. NEW ASPECTS OF TELECOMMUNICATIONS AND INFORMATICS, 2008, : 37 - 43
  • [7] A FAMILY OF PARALLEL PREFIX ALGORITHMS EMBEDDED IN NETWORKS
    TAKESUE, M
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1993, 4 (10) : 1179 - 1184
  • [8] A family of computation-efficient parallel prefix algorithms
    Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, 43 Keelung Road, Taipei 106, Taiwan
    [J]. WSEAS Trans. Comput., 2006, 12 (3060-3066):
  • [9] Optimal and efficient parallel algorithms for summing and prefix summing
    Santos, EE
    [J]. EIGHTH IEEE SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 1996, : 504 - 511
  • [10] Performance Analysis of Parallel Sorting Algorithms using MPI
    Durad, Muhammad Hanif
    Akhtar, Muhammad Naveed
    Irfan-ul-Haq
    [J]. PROCEEDINGS OF 2014 12TH INTERNATIONAL CONFERENCE ON FRONTIERS OF INFORMATION TECHNOLOGY, 2014, : 202 - 207