Deep and reinforcement learning for automated task scheduling in large-scale cloud computing systems

Cited by: 78
Authors
Rjoub, Gaith [1]
Bentahar, Jamal [1]
Wahab, Omar Abdel [2]
Bataineh, Ahmed Saleh [1]
Affiliations
[1] Concordia Univ, Concordia Inst Informat Syst Engn, Sir George Williams Campus,1455 Maisonneuve Blvd, Montreal, PQ, Canada
[2] Univ Quebec Outaouais, Dept Comp Sci & Engn, Gatineau, PQ, Canada
Source
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
cloud automation; deep learning; reinforcement learning; task scheduling; RESOURCE-ALLOCATION;
DOI
10.1002/cpe.5919
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Cloud computing is undeniably becoming the main computing and storage platform for today's major workloads. From Internet of things and Industry 4.0 workloads to big data analytics and decision-making jobs, cloud systems daily receive a massive number of tasks that need to be simultaneously and efficiently mapped onto the cloud resources. Therefore, deriving an appropriate task scheduling mechanism that can both minimize tasks' execution delay and cloud resources utilization is of prime importance. Recently, the concept of cloud automation has emerged to reduce the manual intervention and improve the resource management in large-scale cloud computing workloads. In this article, we capitalize on this concept and propose four deep and reinforcement learning-based scheduling approaches to automate the process of scheduling large-scale workloads onto cloud computing resources, while reducing both the resource consumption and task waiting time. These approaches are: reinforcement learning (RL), deep Q networks, recurrent neural network long short-term memory (RNN-LSTM), and deep reinforcement learning combined with LSTM (DRL-LSTM). Experiments conducted using real-world datasets from Google Cloud Platform revealed that DRL-LSTM outperforms the other three approaches. The experiments also showed that DRL-LSTM minimizes the CPU usage cost up to 67% compared with the shortest job first (SJF), and up to 35% compared with both the round robin (RR) and improved particle swarm optimization (PSO) approaches. Moreover, our DRL-LSTM solution decreases the RAM memory usage cost up to 72% compared with the SJF, up to 65% compared with the RR, and up to 31.25% compared with the improved PSO.
Pages: 14
Related papers
50 in total
  • [1] GARLSched: Generative adversarial deep reinforcement learning task scheduling optimization for large-scale high performance computing systems
    Li, Jingbo
    Zhang, Xingjun
    Wei, Jia
    Ji, Zeyu
    Wei, Zheng
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, 135 : 259 - 269
  • [2] Deep reinforcement learning for scheduling in large-scale networked control systems
    Redder, Adrian
    Ramaswamy, Arunselvan
    Quevedo, Daniel E.
    [J]. IFAC PAPERSONLINE, 2019, 52 (20) : 333 - 338
  • [3] A novel deep reinforcement learning scheme for task scheduling in cloud computing
    K. Siddesha
    G. V. Jayaramaiah
    Chandrapal Singh
    [J]. Cluster Computing, 2022, 25 : 4171 - 4188
  • [4] A novel deep reinforcement learning scheme for task scheduling in cloud computing
    Siddesha, K.
    Jayaramaiah, G. V.
    Singh, Chandrapal
    [J]. CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2022, 25 (06) : 4171 - 4188
  • [5] Energy-aware task scheduling optimization with deep reinforcement learning for large-scale heterogeneous systems
    Jingbo Li
    Xingjun Zhang
    Zheng Wei
    Jia Wei
    Zeyu Ji
    [J]. CCF Transactions on High Performance Computing, 2021, 3 : 383 - 392
  • [6] Energy-aware task scheduling optimization with deep reinforcement learning for large-scale heterogeneous systems
    Li, Jingbo
    Zhang, Xingjun
    Wei, Zheng
    Wei, Jia
    Ji, Zeyu
    [J]. CCF TRANSACTIONS ON HIGH PERFORMANCE COMPUTING, 2021, 3 (04) : 383 - 392
  • [7] Reinforcement Learning-Based Intelligent Task Scheduling for Large-Scale IoT Systems
    Jin, Chenghou
    Han, Yusen
    Deng, Zhuo
    Chen, Ying
    Liu, Chengxia
    Huang, Jiwei
    [J]. Wireless Communications and Mobile Computing, 2023, 2023
  • [8] DRLBTSA: Deep reinforcement learning based task-scheduling algorithm in cloud computing
    Mangalampalli, Sudheer
    Karri, Ganesh Reddy
    Kumar, Mohit
    Khalaf, Osama Ibrahim
    Romero, Carlos Andres Tavera
    Sahib, Ghaida Muttashar Abdul
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (03) : 8359 - 8387
  • [9] DRLBTSA: Deep reinforcement learning based task-scheduling algorithm in cloud computing
    Sudheer Mangalampalli
    Ganesh Reddy Karri
    Mohit Kumar
    Osama Ibrahim Khalaf
    Carlos Andres Tavera Romero
    Ghaida Muttashar Abdul Sahib
    [J]. Multimedia Tools and Applications, 2024, 83 : 8359 - 8387
  • [10] Task Scheduling in Cloud Using Deep Reinforcement Learning
    Swarup, Shashank
    Shakshuki, Elhadi M.
    Yasar, Ansar
    [J]. 12TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 4TH INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2021, 184 : 42 - 51