MATE, a Unified Model for Communication-Tolerant Scientific Applications

Cited by: 0
Authors
Martin, Sergio M. [1]
Baden, Scott B. [1, 2]
Affiliations
[1] Univ Calif San Diego, La Jolla, CA 92093 USA
[2] Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
Keywords
Scientific computing; Communication-Tolerance; SPMD
DOI
10.1007/978-3-030-34627-0_10
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We present MATE, a model for developing communication-tolerant scientific applications. MATE employs a combination of mechanisms to reduce or hide the cost of network and intra-node data movement. While previous approaches have been proposed to reduce both sources of communication overhead separately, the contribution of MATE is demonstrating the symbiotic effect of reducing both forms of data movement taken together. Furthermore, MATE provides these benefits within a single unified model, as opposed to hybrid (e.g., MPI+X) approaches. We demonstrate MATE's effectiveness in reducing the cost of communication in three scientific computing motifs on up to 32k cores of the NERSC Cori Phase I supercomputer.
Pages: 120-137
Page count: 18