AN IMPROVED KUBERNETES SCHEDULING ALGORITHM FOR DEEP LEARNING PLATFORM

Cited by: 3
Authors
Shi Huaxin [1]
Gu Xiaofeng [1]
Kuang Ping [1]
Huang Hongyu [1]
Affiliation
[1] Univ Elect Sci & Technol China, Sch Informat & Software Engn, Chengdu 610054, Peoples R China
Keywords
Kubernetes; Docker; Deep Learning
DOI
10.1109/ICCWAMTIP51612.2020.9317317
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Most existing deep learning platforms focus only on helping users start training tasks quickly and tend to ignore the scenario in which multiple teams collaborate on a single resource pool. In this paper, we propose an improved scheduling algorithm oriented to a multi-tenant model, in which team users are modeled as virtual clusters and cluster load is monitored regularly. We apply the optimized Kubernetes scheduling algorithm to a Docker-based deep learning platform; our method can ensure load balance and meet the needs of users.
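The abstract does not give the algorithm's details, so the following is only a minimal illustrative sketch of the general idea it describes: each team is treated as a virtual cluster, and a task is placed on the least-loaded node that can hold it. The per-team GPU quota, the load metric (fraction of GPUs in use), and all names (`Node`, `VirtualCluster`, `schedule`) are assumptions made for illustration. They are not taken from the paper, and the sketch does not use the Kubernetes API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A worker node in the shared resource pool (hypothetical model)."""
    name: str
    gpu_capacity: int
    gpu_used: int = 0

    @property
    def load(self) -> float:
        # Assumed load metric: fraction of the node's GPUs in use.
        return self.gpu_used / self.gpu_capacity


@dataclass
class VirtualCluster:
    """A team modeled as a virtual cluster with an assumed GPU quota."""
    team: str
    gpu_quota: int
    gpu_used: int = 0


def schedule(task_gpus: int, vc: VirtualCluster, nodes: List[Node]) -> Optional[str]:
    """Place a task on the least-loaded node that fits; return its name.

    Returns None if the team's quota would be exceeded or no node has
    enough free GPUs. Ties on load are broken by node name.
    """
    if vc.gpu_used + task_gpus > vc.gpu_quota:
        return None  # the team has exhausted its share of the pool
    candidates = [n for n in nodes if n.gpu_used + task_gpus <= n.gpu_capacity]
    if not candidates:
        return None  # no single node can hold the task
    target = min(candidates, key=lambda n: (n.load, n.name))
    target.gpu_used += task_gpus
    vc.gpu_used += task_gpus
    return target.name


if __name__ == "__main__":
    pool = [Node("node-a", gpu_capacity=4, gpu_used=2), Node("node-b", gpu_capacity=4)]
    team = VirtualCluster(team="vision", gpu_quota=3)
    print(schedule(2, team, pool))
    print(schedule(1, team, pool))
    print(schedule(1, team, pool))
```

In a real deployment the load figures would come from periodic monitoring rather than from counters updated at scheduling time; the sketch updates them inline only to stay self-contained.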
Pages: 113-116
Number of pages: 4