Large-Scale Experiment for Topology-Aware Resource Management

Cited by: 0
Authors
Georgiou, Yiannis [1]
Mercier, Guillaume [2]
Villiermet, Adele [3]
Affiliations
[1] Atos Bull, Grenoble, France
[2] Bordeaux INP, Talence, France
[3] Inria Bordeaux Sud Ouest, Talence, France
Keywords
Resource management; Job allocation; Topology-aware placement; Scheduling; SLURM; PLACEMENT
DOI
10.1007/978-3-319-75178-8_15
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
A Resource and Job Management System (RJMS) is a crucial piece of system software in the HPC stack. It is responsible for efficiently delivering computing power to applications in supercomputing environments, and its main intelligence lies in the resource selection techniques it uses to find the most suitable resources for scheduling users' jobs. In [8], we introduced a new topology-aware resource selection algorithm that determines the best choice among the available nodes of the platform based on their position in the network and on application behaviour (expressed as a communication matrix). We integrated this algorithm as a plugin in SLURM and validated it with several optimization schemes by comparing it with the default SLURM algorithm. This paper presents further experiments with regard to this selection process.
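As a rough illustration of the idea in the abstract (choosing nodes from their position in the network and from a communication matrix), the sketch below scores candidate placements on a small tree network. It is not the algorithm from [8] or the SLURM plugin: the cost model (traffic volume times hop distance), the node names, the two-level tree, and the functions `tree_distance` and `placement_cost` are all assumptions made for this example.

```python
from itertools import combinations

def tree_distance(path_a, path_b):
    """Hop count between two leaves of a tree, given root-first paths."""
    common = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)

def placement_cost(comm, placement, paths):
    """Sum of traffic volume times hop distance over all process pairs.

    comm[i][j] is the (symmetric) volume exchanged by processes i and j;
    placement[i] is the node that hosts process i.
    """
    return sum(
        comm[i][j] * tree_distance(paths[placement[i]], paths[placement[j]])
        for i, j in combinations(range(len(placement)), 2)
    )

# Hypothetical two-level tree: two switches with two nodes each.
paths = {
    "n0": ("s0", "n0"), "n1": ("s0", "n1"),
    "n2": ("s1", "n2"), "n3": ("s1", "n3"),
}

# Hypothetical communication matrix: processes 0 and 1 talk heavily.
comm = [
    [0, 10, 1],
    [10, 0, 1],
    [1, 1, 0],
]

# Two candidate placements of the three processes on free nodes.
candidates = [
    ["n0", "n1", "n2"],  # heavy pair shares switch s0
    ["n0", "n2", "n1"],  # heavy pair is split across switches
]

best = min(candidates, key=lambda p: placement_cost(comm, p, paths))
print(best, placement_cost(comm, best, paths))
```

Under this toy cost model, the placement that keeps the heavily communicating pair under the same switch scores lower, which is the intuition behind topology-aware selection. A real selector would also have to search the space of free nodes efficiently instead of enumerating candidates.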
Pages: 179-186
Number of pages: 8