Locality-Aware Parallel Process Mapping for Multi-Core HPC Systems

Cited by: 14
Authors
Hursey, Joshua [1]
Squyres, Jeffrey M. [1]
Dontje, Terry [1]
Affiliations
[1] Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
Keywords
Process Affinity; Locality; NUMA; MPI; Resource Management
DOI
10.1109/CLUSTER.2011.59
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
High Performance Computing (HPC) systems are composed of servers containing an ever-increasing number of cores. With such high processor core counts, non-uniform memory access (NUMA) architectures are almost universally used to reduce inter-processor and memory communication bottlenecks by distributing processors and memory throughout a server-internal networking topology. Application studies have shown that tuning the placement of processes in a server's NUMA networking topology to the application can have a dramatic impact on performance. The performance implications are magnified when running a parallel job across multiple server nodes, especially with large-scale HPC applications. This paper presents the Locality-Aware Mapping Algorithm (LAMA) for distributing the individual processes of a parallel application across processing resources in an HPC system, paying particular attention to the internal server NUMA topologies. The algorithm is able to support both homogeneous and heterogeneous hardware systems, and dynamically adapts to the available hardware and user-specified process layout at run-time. As implemented in Open MPI, the LAMA provides 362,880 mapping permutations and is able to naturally scale out to additional hardware resources as they become available in future architectures.
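The figure of 362,880 mapping permutations equals 9!, which would be consistent with choosing an iteration order over nine hardware levels. The short Python sketch below illustrates that idea only; it is not the Open MPI implementation. The level names in `LEVELS` and the helper functions `count_layouts` and `map_processes` are hypothetical and do not come from this record.

```python
from itertools import product
from math import factorial

# Hypothetical level names (assumption: the abstract does not list them).
LEVELS = ["node", "board", "socket", "numa", "l3", "l2", "l1", "core", "hwthread"]


def count_layouts(levels):
    """Number of distinct orderings of the given levels."""
    return factorial(len(levels))


def map_processes(num_procs, sizes, order):
    """Assign ranks to resource coordinates.

    sizes: dict mapping each level to how many of that resource exist
           within its parent.
    order: levels listed from fastest-varying to slowest-varying.
    Returns one {level: index} dict per rank. If num_procs exceeds the
    number of available slots, fewer placements are returned (this sketch
    does not wrap around or oversubscribe).
    """
    # itertools.product varies its last axis fastest, so reverse the order.
    slow_to_fast = list(reversed(order))
    axes = [range(sizes[level]) for level in slow_to_fast]
    placements = []
    for coords in product(*axes):
        if len(placements) == num_procs:
            break
        placements.append(dict(zip(slow_to_fast, coords)))
    return placements
```

For example, on a hypothetical machine with two sockets of two cores each, `map_processes(4, {"socket": 2, "core": 2}, ["core", "socket"])` fills both cores of socket 0 before moving to socket 1, whereas the order `["socket", "core"]` alternates between sockets. Each ordering of the levels yields a different layout, which is how a permutation count arises.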
Pages: 527-531
Number of pages: 5
Related papers
50 in total
  • [1] SLAW: A Scalable Locality-aware Adaptive Work-stealing Scheduler for Multi-core Systems
    Guo, Yi
    Zhao, Jisheng
    Cave, Vincent
    Sarkar, Vivek
    PPOPP 2010: PROCEEDINGS OF THE 2010 ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING, 2010, : 341 - 342
  • [2] SLAW: A Scalable Locality-aware Adaptive Work-stealing Scheduler for Multi-core Systems
    Guo, Yi
    Zhao, Jisheng
    Cave, Vincent
    Sarkar, Vivek
    ACM SIGPLAN NOTICES, 2010, 45 (05) : 341 - 342
  • [3] A Locality-Aware, Energy-Efficient Cache Design for Large-Scale Multi-Core Systems
    Alshegaifi, Abdulrahman
    Huang, Chun-Hsi
    IEEE 2018 INTERNATIONAL CONGRESS ON CYBERMATICS / 2018 IEEE CONFERENCES ON INTERNET OF THINGS, GREEN COMPUTING AND COMMUNICATIONS, CYBER, PHYSICAL AND SOCIAL COMPUTING, SMART DATA, BLOCKCHAIN, COMPUTER AND INFORMATION TECHNOLOGY, 2018, : 497 - 502
  • [4] Locality-aware Partitioning in Parallel Database Systems
    Zamanian, Erfan
    Binnig, Carsten
    Salama, Abdallah
    SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2015, : 17 - 30
  • [5] LAWS: Locality-Aware Work-Stealing for Multi-socket Multi-core Architectures
    Chen, Quan
    Guo, Minyi
    Guan, Haibing
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, (ICS'14), 2014, : 3 - 12
  • [6] Locality-Aware Mapping of Nested Parallel Patterns on GPUs
    Lee, HyoukJoong
    Brown, Kevin J.
    Sujeeth, Arvind K.
    Rompf, Tiark
    Olukotun, Kunle
    2014 47TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE (MICRO), 2014, : 63 - 74
  • [7] Multi-core Aware Process Mapping and its Impact on Communication Overhead of Parallel Applications
    Rodrigues, Eduardo R.
    Madruga, Felipe L.
    Navaux, Philippe O. A.
    Panetta, Jairo
    ISCC: 2009 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS, VOLS 1 AND 2, 2009, : 810 - 816
  • [8] Locality-Aware Mapping and Scheduling for Multicores
    Ding, Wei
    Zhang, Yuanrui
    Kandemir, Mahmut
    Srinivas, Jithendra
    Yedlapalli, Praveen
    PROCEEDINGS OF THE 2013 IEEE/ACM INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION (CGO), 2013, : 335 - 346
  • [9] On the Overhead of Topology Discovery for Locality-aware Scheduling in HPC
    Goglin, Brice
    2017 25TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP 2017), 2017, : 186 - 190
  • [10] Locality-aware task scheduling for homogeneous parallel computing systems
    Muhammad Khurram Bhatti
    Isil Oz
    Sarah Amin
    Maria Mushtaq
    Umer Farooq
    Konstantin Popov
    Mats Brorsson
    Computing, 2018, 100 : 557 - 595