A uGNI-based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect

Cited by: 9
Authors
Sun, Yanhua [1]
Zheng, Gengbin [1]
Kale, Laxmikant V. [1]
Jones, Terry R. [2]
Olson, Ryan [3]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[2] Oak Ridge Natl Lab, Comp Sci & Math Div, Oak Ridge, TN USA
[3] Cray Inc, Seattle, WA USA
Keywords
Cray XE/XT; Gemini Interconnect; Asynchronous message-driven; Low Level Runtime System
DOI
10.1109/IPDPS.2012.127
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Gemini, the network for the new Cray XE/XK systems, features low latency, high bandwidth and strong scalability. Its hardware support for remote direct memory access enables efficient implementation of global address space programming languages. Although the user Generic Network Interface (uGNI) provides a low-level interface for Gemini with support for the message-passing programming model (MPI), it remains challenging to port alternative programming models with scalable performance. CHARM++ is an object-oriented message-driven programming model. Its applications have been shown to scale up to the full Jaguar Cray XT machine. In this paper, we present an implementation of this programming model on uGNI for the Cray XE/XK systems. Several techniques are presented to exploit the uGNI capabilities by reducing memory copy and registration overhead, taking advantage of persistent communication, and improving intra-node communication. Our micro-benchmark results demonstrate that the uGNI-based runtime system outperforms the MPI-based implementation by up to 50% in terms of message latency. For communication-intensive applications such as N-Queens, this implementation scales up to 15,360 cores of a Cray XE6 machine and is 70% faster than the MPI-based implementation. In the molecular dynamics application NAMD, performance is also improved by as much as 18%.
Pages: 751-762
Page count: 12