DRS: A deep reinforcement learning enhanced Kubernetes scheduler for microservice-based system

Cited by: 1
Authors
Jian, Zhaolong [1]
Xie, Xueshuo [1, 2]
Fang, Yaozheng [1]
Jiang, Yibing [1]
Lu, Ye [3]
Dash, Ankan [4]
Li, Tao [1, 2]
Wang, Guiling [4]
Affiliations
[1] Nankai Univ, Coll Comp Sci, Tianjin, Peoples R China
[2] Chinese Acad Sci, State Key Lab Comp Architecture, Inst Comp Technol, Beijing, Peoples R China
[3] Nankai Univ, Coll Cyber Sci, Tianjin, Peoples R China
[4] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ USA
Source
SOFTWARE-PRACTICE & EXPERIENCE | 2024, Vol. 54, No. 10
Keywords
deep reinforcement learning; Kubernetes scheduler; microservice scheduling; resource awareness
DOI
10.1002/spe.3284
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Kubernetes, the most widely known container orchestration framework, is now widely used to manage and schedule the resources of microservices in cloud-native distributed applications. However, its native scheduler, Kube-scheduler, preferentially schedules microservices to nodes with rich and balanced CPU and memory resources, evaluated on a single node at a time, which may cause resource fragmentation and decrease resource utilization. In this paper, we propose a deep reinforcement learning enhanced Kubernetes scheduler named DRS. We first formulate the Kubernetes scheduling problem as a Markov decision process with carefully designed state, action, and reward structures, aiming to increase resource utilization and decrease load imbalance. Then, we design and implement a DRS monitor that perceives six parameters concerning resource utilization and builds a global view of all available resources. Finally, DRS automatically learns the scheduling policy through interaction with the Kubernetes cluster, without relying on expert knowledge about workload and cluster status. We implement a prototype of DRS in a five-node Kubernetes cluster and evaluate its performance. Experimental results show that DRS overcomes the shortcomings of Kube-scheduler and achieves the expected scheduling target under three workloads. With only 3.27% CPU overhead and 0.648% communication delay, DRS outperforms Kube-scheduler by 27.29% in resource utilization and reduces load imbalance by 2.90 times on average.
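To make the Markov decision process framing in the abstract more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a state made of per-node CPU and memory utilization (only two of the six monitored parameters, since the abstract does not name them), an action that is the index of the node receiving the pod, and a hypothetical reward equal to mean utilization minus a weighted load imbalance. The names `apply_action`, `reward`, `greedy_action`, and `imbalance_weight` are illustrative, and the greedy one-step choice merely stands in for the learned policy.

```python
from statistics import mean, pstdev


def apply_action(state, node, demand):
    """Return a new cluster state after placing a pod on `node`.

    `state` is a list of dicts mapping resource name -> utilization in [0, 1];
    `demand` maps resource name -> utilization the pod adds (assumed format).
    """
    new_state = [dict(n) for n in state]  # copy so the input is not mutated
    for resource, amount in demand.items():
        new_state[node][resource] += amount
    return new_state


def reward(state, imbalance_weight=1.0):
    """Hypothetical reward: mean utilization minus weighted load imbalance.

    Load imbalance is taken here as the population standard deviation of
    per-node average utilization; the paper's actual reward may differ.
    """
    per_node = [mean(n.values()) for n in state]
    return mean(per_node) - imbalance_weight * pstdev(per_node)


def greedy_action(state, demand):
    """Pick the feasible node whose resulting state has the highest reward.

    A one-step greedy stand-in for the learned policy; returns None when
    no node can host the pod without exceeding full utilization.
    """
    feasible = [
        i for i, n in enumerate(state)
        if all(n[r] + demand[r] <= 1.0 for r in demand)
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda i: reward(apply_action(state, i, demand)))


cluster = [
    {"cpu": 0.8, "mem": 0.7},
    {"cpu": 0.2, "mem": 0.1},
    {"cpu": 0.5, "mem": 0.5},
]
pod = {"cpu": 0.1, "mem": 0.1}
chosen = greedy_action(cluster, pod)
print("chosen node:", chosen)
```

In a reinforcement learning setting, an agent would replace `greedy_action` with a policy trained from the rewards observed while interacting with the cluster, which is the role the abstract assigns to DRS.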
Pages: 2102-2126
Page count: 25