Correcting errors in message passing systems

Cited by: 0
Authors
Pedersen, JB [1]
Wagner, A [1]
Affiliations
[1] Univ British Columbia, Vancouver, BC V5Z 1M9, Canada
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We present an algorithm that corrects communication errors by analyzing delivered and undelivered messages. It suggests corrective measures to remove errors introduced by typographical mistakes in message passing systems such as PVM and MPI. The paper focuses on the validity of the algorithm, proving that for a nontrivial number of errors it can suggest changes that correct them. The algorithm has been implemented as a tool in Millipede (Multi Level Interactive Parallel Debugger), a support environment developed to help programmers debug message passing programs at different abstraction levels.
Pages: 122-137
Page count: 16
Related papers
50 in total
  • [31] An optimisation of allreduce communication in message-passing systems
    Jocksch, Andreas
    Ohana, Noé
    Lanti, Emmanuel
    Koutsaniti, Eirini
    Karakasis, Vasileios
    Villard, Laurent
    [J]. Parallel Computing, 2021, 107
  • [32] Communication Complexity of Consensus in Anonymous Message Passing Systems
    Fusco, Emanuele G.
    Pelc, Andrzej
    [J]. FUNDAMENTA INFORMATICAE, 2015, 137 (03) : 305 - 322
  • [33] Snap-Stabilization in Message-Passing Systems
    Delaet, Sylvie
    Devismes, Stephane
    Nesterenko, Mikhail
    Tixeuil, Sebastien
    [J]. DISTRIBUTED COMPUTING AND NETWORKING, 2009, 5408 : 281 - +
  • [34] An optimisation of allreduce communication in message-passing systems
    Jocksch, Andreas
    Ohana, Noe
    Lanti, Emmanuel
    Koutsaniti, Eirini
    Karakasis, Vasileios
    Villard, Laurent
    [J]. PARALLEL COMPUTING, 2021, 107
  • [35] SHARING MEMORY ROBUSTLY IN MESSAGE-PASSING SYSTEMS
    ATTIYA, H
    BARNOY, A
    DOLEV, D
    [J]. JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY, 1995, 42 (01) : 124 - 142
  • [36] Adaptive message passing environment for wafer scale systems
    Blight, David C.
    McLeod, Robert D.
    [J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 1993, 1 (04) : 559 - 562
  • [37] Unifying stabilization and termination in message-passing systems
    Arora, A
    Nesterenko, M
    [J]. 21ST INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, PROCEEDINGS, 2001 : 99 - 106
  • [38] Snap-Stabilization in Message-Passing Systems
    Delaet, Sylvie
    Devismes, Stephane
    Nesterenko, Mikhail
    Tixeuil, Sebastien
    [J]. PODC'08: PROCEEDINGS OF THE 27TH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2008 : 443 - 443
  • [39] Frequent pattern mining on message passing multiprocessor systems
    Javed, A
    Khokhar, A
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2004, 16 (03) : 321 - 334
  • [40] Scheduling of Elastic Message Passing Applications on HPC Systems
    Lina, Debolina Halder
    Ghafoor, Sheikh
    Hines, Thomas
    [J]. JOB SCHEDULING STRATEGIES FOR PARALLEL PROCESSING, JSSPP 2022, 2023, 13592 : 172 - 191