A scheduling framework for large-scale, parallel, and topology-aware applications

Cited by: 3
Authors
Kravtsov, Valentin [1]
Bar, Pavel [1]
Carmeli, David [1]
Schuster, Assaf [1]
Swain, Martin [2]
Affiliations
[1] Technion Israel Inst Technol, Dept Comp Sci, Technion, Haifa, Israel
[2] Univ Ulster, Syst Biol Res Grp, Coleraine BT52 1SA, Londonderry, North Ireland
Keywords
QosCosGrid; Grid; Supercomputer; Scheduler; Algorithm
DOI
10.1016/j.jpdc.2010.05.001
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Scheduling of large-scale, distributed topology-aware applications requires that not only the properties of the requested machines be considered, but also the properties of the machines' interconnections. This requirement severely complicates the scheduling process, as even a matching between a single multi-processor task and available machines in a single time slot becomes an NP-complete problem with no polynomial approximation. In this paper we propose a complete scheduling framework for multi-cluster, heterogeneous environments that provides, in practice, an efficient solution for the scheduling of topology-aware applications. The proposed framework is very flexible as it is composed of pluggable components and can be easily configured to support a variety of scheduling policies. We also describe three novel scheduling and co-allocation algorithms that were developed and plugged into the framework. The proposed scheduling framework was integrated into the QosCosGrid system, where it is used as the main decision-making module. © 2010 Elsevier Inc. All rights reserved.
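To make the matching problem in the abstract concrete, the following is a minimal illustrative sketch, not one of the paper's algorithms. It assumes a hypothetical representation in which each process of a task needs a number of CPUs, each pair of communicating processes needs a minimum bandwidth, and the available machines and their interconnections are given as dictionaries. The function name `find_matching` and all data values are invented for illustration. The search is exhaustive, so its cost grows factorially with the number of machines, which is why practical schedulers rely on heuristics.

```python
from itertools import permutations


def find_matching(proc_cpu, link_req, machine_cpu, bandwidth):
    """Brute-force search for a topology-aware assignment (illustrative only).

    proc_cpu:    {process: CPUs required}
    link_req:    {(process_a, process_b): minimum bandwidth required}
    machine_cpu: {machine: CPUs available}
    bandwidth:   {frozenset({machine_a, machine_b}): bandwidth available}

    Returns {process: machine} for the first feasible assignment, or None.
    """
    procs = sorted(proc_cpu)
    # Try every ordered choice of distinct machines, one per process.
    for chosen in permutations(sorted(machine_cpu), len(procs)):
        assign = dict(zip(procs, chosen))
        # Node constraint: each machine must offer enough CPUs.
        if any(machine_cpu[assign[p]] < proc_cpu[p] for p in procs):
            continue
        # Link constraint: each required connection must have enough bandwidth.
        if all(
            bandwidth.get(frozenset((assign[a], assign[b])), 0) >= need
            for (a, b), need in link_req.items()
        ):
            return assign
    return None


# Hypothetical example data.
proc_cpu = {"p0": 2, "p1": 2, "p2": 1}
link_req = {("p0", "p1"): 10, ("p1", "p2"): 1}
machine_cpu = {"m0": 4, "m1": 2, "m2": 2, "m3": 1}
bandwidth = {
    frozenset(("m0", "m1")): 10,
    frozenset(("m1", "m2")): 1,
    frozenset(("m0", "m2")): 1,
    frozenset(("m2", "m3")): 10,
}

result = find_matching(proc_cpu, link_req, machine_cpu, bandwidth)
```

Considering only the machines' own properties would reduce this to a simple per-node check; it is the link constraint across pairs of assigned machines that couples the choices together and makes the search hard.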
Pages: 983-992
Page count: 10
Related papers
50 in total
  • [1] Topology-Aware Scheduling Framework for Microservice Applications in Cloud
    Li, Xin
    Zhou, Junsong
    Wei, Xin
    Li, Dawei
    Qian, Zhuzhong
    Wu, Jie
    Qin, Xiaolin
    Lu, Sanglu
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (05) : 1635 - 1649
  • [2] Topology-aware algorithms for large-scale communication
    Rodrigues, L
    Veríssimo, P
    [J]. ADVANCES IN DISTRIBUTED SYSTEMS: ADVANCED DISTRIBUTED COMPUTING: FROM ALGORITHMS TO SYSTEMS, 2000, 1752 : 127 - 156
  • [3] Large-Scale Experiment for Topology-Aware Resource Management
    Georgiou, Yiannis
    Mercier, Guillaume
    Villiermet, Adele
    [J]. EURO-PAR 2017: PARALLEL PROCESSING WORKSHOPS, 2018, 10659 : 179 - 186
  • [4] Topology-Aware Mappings for Large-Scale Eigenvalue Problems
    Aktulga, Hasan Metin
    Yang, Chao
    Ng, Esmond G.
    Maris, Pieter
    Vary, James P.
    [J]. EURO-PAR 2012 PARALLEL PROCESSING, 2012, 7484 : 830 - 842
  • [5] Topology-aware Sparse Allreduce for Large-scale Deep Learning
    Thao Nguyen Truong
    Wahib, Mohamed
    Takano, Ryousei
    [J]. 2019 IEEE 38TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2019,
  • [6] LPMS: A Low-cost Topology-aware Process Mapping Method for Large-scale Parallel Applications on Shared HPC Systems
    Yan Baicheng
    Yang Zhang
    Xiao Limin
    Zhou Yi
    Wei Bing
    Song Yao
    [J]. PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPP 2019), 2019,
  • [7] QTMS: A quadratic time complexity topology-aware process mapping method for large-scale parallel applications on shared HPC system
    Yan, Baicheng
    Xiao, Limin
    Qin, Guangjun
    Yang, Zhang
    Dong, Bin
    Yu, Haonan
    Wu, Hongyu
    [J]. PARALLEL COMPUTING, 2020, 94-95
  • [8] Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers
    Tessier, Francois
    Malakar, Preeti
    Vishwanath, Venkatram
    Jeannot, Emmanuel
    Isaila, Florin
    [J]. PROCEEDINGS OF FIRST WORKSHOP ON OPTIMIZATION OF COMMUNICATION IN HPC RUNTIME SYSTEMS (COM-HPC 2016), 2016, : 73 - 81
  • [9] A Topology-Aware Adaptive Deployment Framework for Elastic Applications
    Keller, Matthias
    Peuster, Manuel
    Robbert, Christoph
    Karl, Holger
    [J]. 2013 17TH INTERNATIONAL CONFERENCE ON INTELLIGENCE IN NEXT GENERATION NETWORKS (ICIN), 2013, : 61 - 69
  • [10] Topology-Aware OpenMP Process Scheduling
    Thoman, Peter
    Moritsch, Hans
    Fahringer, Thomas
    [J]. BEYOND LOOP LEVEL PARALLELISM IN OPENMP: ACCELERATORS, TASKING AND MORE, PROCEEDINGS, 2010, 6132 : 96 - 108