Locality-Aware Parallel Process Mapping for Multi-Core HPC Systems

Cited by: 14
Authors
Hursey, Joshua [1]
Squyres, Jeffrey M. [1]
Dontje, Terry [1]
Affiliations
[1] Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
Keywords
Process Affinity; Locality; NUMA; MPI; Resource Management
DOI
10.1109/CLUSTER.2011.59
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High Performance Computing (HPC) systems are composed of servers containing an ever-increasing number of cores. With such high processor core counts, non-uniform memory access (NUMA) architectures are almost universally used to reduce inter-processor and memory communication bottlenecks by distributing processors and memory throughout a server-internal networking topology. Application studies have shown that tuning the placement of processes within a server's NUMA networking topology to suit the application can have a dramatic impact on performance. The performance implications are magnified when running a parallel job across multiple server nodes, especially with large-scale HPC applications. This paper presents the Locality-Aware Mapping Algorithm (LAMA) for distributing the individual processes of a parallel application across processing resources in an HPC system, paying particular attention to the internal server NUMA topologies. The algorithm supports both homogeneous and heterogeneous hardware systems, and dynamically adapts to the available hardware and the user-specified process layout at run-time. As implemented in Open MPI, the LAMA provides 362,880 mapping permutations and scales out naturally to additional hardware resources as they become available in future architectures.
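The idea of a user-specified process layout can be illustrated with a small sketch. This is not the LAMA implementation from the paper or from Open MPI. The toy topology, the level names, and the function `map_processes` are hypothetical and exist only for illustration. The abstract does not explain where the figure 362,880 comes from. It equals 9!, which would be consistent with freely ordering nine resource levels, but that reading is an assumption here.

```python
from itertools import product
from math import factorial

# Hypothetical toy topology: level name -> instances per parent (not from the paper).
TOPOLOGY = {"node": 2, "socket": 2, "core": 2}


def map_processes(num_procs, order, topology=TOPOLOGY):
    """Return one (node, socket, core) slot per rank.

    `order` lists the levels from fastest-varying to slowest-varying,
    so order[0] is the level that changes between consecutive ranks.
    """
    levels = list(topology)
    slowest_first = list(reversed(order))
    # itertools.product varies its LAST argument fastest, hence the reversal.
    ranges = [range(topology[level]) for level in slowest_first]
    placements = []
    for combo in product(*ranges):
        if len(placements) == num_procs:
            break
        slot = dict(zip(slowest_first, combo))
        placements.append(tuple(slot[level] for level in levels))
    return placements


# Pack ranks onto the cores of one node first.
packed = map_processes(4, ["core", "socket", "node"])
# Spread consecutive ranks across nodes first.
spread = map_processes(4, ["node", "socket", "core"])

# Orderings of three levels in this toy; nine levels would give 9! orderings.
toy_orderings = factorial(len(TOPOLOGY))
nine_level_orderings = factorial(9)
```

In this toy, changing only the order of the level names switches between a packed placement and a round-robin placement across nodes, which is the kind of flexibility the abstract attributes to the user-specified layout.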
Pages: 527-531
Page count: 5