Proactive Container Auto-scaling for Cloud Native Machine Learning Services

Cited by: 12
Authors
Buchaca, David [1]
Berral, Josep Lluis [1]
Wang, Chen [2]
Youssef, Alaa [2]
Affiliations
[1] Univ Politecn Cataluna, Barcelona Supercomp Ctr, Barcelona, Spain
[2] IBM Res, Yorktown Hts, NY USA
Keywords
Cloud Native; Machine Learning Service; Container; Auto-scaling; WORKLOAD; PREDICTION;
DOI
10.1109/CLOUD49709.2020.00070
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Understanding the resource usage behavior of ever-increasing machine learning workloads is critical to cloud providers offering Machine Learning (ML) services. The ability to auto-scale resources for customer workloads can significantly improve resource utilization and thus greatly reduce cost. Here we leverage the AI4DL framework [1] to characterize workloads and discover resource consumption phases. We extend the existing technique into an incremental phase discovery method that applies to more general types of ML workloads, for both training and inference. We use a time-window Multi-Layer Perceptron (MLP) to predict phases in containers with different types of workloads. We then propose a predictive vertical auto-scaling policy that resizes the container dynamically according to the phase predictions. We evaluate our predictive auto-scaling policies on 561 long-running containers with multiple types of ML workloads. The predictive policy reduces allocated CPU by up to 38% compared to the default resource provisioning policies set by developers. Comparing our predictive policies with commonly used reactive auto-scaling policies, we find that the predictive policies accurately predict sudden phase transitions (with an F1-score of 0.92) and significantly reduce the number of out-of-memory errors (350 vs. 20). In addition, we show that the predictive auto-scaling policy keeps the number of resizing operations close to that of the best reactive policies.
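The contrast between predictive and reactive vertical scaling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-phase peak-memory table, the 10% headroom, and the sample numbers are all hypothetical, and the phase predictions are assumed to be given (in the paper they come from a time-window MLP).

```python
def predictive_mem_limits(predicted_phases, phase_peak_mem_gib, headroom=1.1):
    """Set the memory limit for each step from the phase predicted for that step.

    phase_peak_mem_gib is a hypothetical lookup of peak memory per phase.
    Returns the per-step limits and the number of resize operations.
    """
    limits, resizes = [], 0
    for phase in predicted_phases:
        target = phase_peak_mem_gib[phase] * headroom
        # A resize happens only when the target differs from the current limit.
        if limits and target != limits[-1]:
            resizes += 1
        limits.append(target)
    return limits, resizes


def reactive_mem_limits(mem_usage_gib, headroom=1.1):
    """Set the memory limit for each step from the usage seen one step earlier."""
    previous = [mem_usage_gib[0]] + list(mem_usage_gib[:-1])
    return [used * headroom for used in previous]


def count_oom(mem_usage_gib, limits):
    """Count the steps where usage exceeds the limit in force (an OOM event)."""
    return sum(used > limit for used, limit in zip(mem_usage_gib, limits))


# Illustrative data only: phase 0 is light, phase 1 is memory-heavy.
predicted = [0, 0, 1, 1, 0]
peaks = {0: 2.0, 1: 8.0}
usage = [1.5, 1.6, 7.5, 7.9, 1.6]

limits, resizes = predictive_mem_limits(predicted, peaks)
predictive_oom = count_oom(usage, limits)
reactive_oom = count_oom(usage, reactive_mem_limits(usage))
```

In this toy trace the reactive policy lags the jump into the heavy phase by one step and exceeds its limit there, while the predictive policy raises the limit before the jump, provided the phase prediction is correct.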
Pages: 475 - 479
Number of pages: 5
Related papers
50 in total
  • [1] Predictive Container Auto-Scaling for Cloud-Native Applications
    Zhao, Hanqing
    Lim, Hyunwoo
    Hanif, Muhammad
    Lee, Choonhwa
    [J]. 2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1280 - 1282
  • [2] PASCAL: An architecture for proactive auto-scaling of distributed services
    Lombardi, Federico
    Muti, Andrea
    Aniello, Leonardo
    Baldoni, Roberto
    Bonomi, Silvia
    Querzoni, Leonardo
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 98 : 342 - 361
  • [3] Proactive Auto-Scaling for Service Function Chains in Cloud Computing Based on Deep Learning
    Taha, Mohammad Bany
    Sanjalawe, Yousef
    Al-Daraiseh, Ahmad
    Fraihat, Salam
    Al-E'mari, Salam R.
    [J]. IEEE ACCESS, 2024, 12 : 38575 - 38593
  • [4] Adaptive Resource Provisioning and Auto-scaling for Cloud Native Software
    Pozdniakova, Olesia
    Mazeika, Dalius
    Cholomskis, Aurimas
    [J]. INFORMATION AND SOFTWARE TECHNOLOGIES, ICIST 2018, 2018, 920 : 113 - 129
  • [5] FLAS: A combination of proactive and reactive auto-scaling architecture for distributed services
    Ramperez, Victor
    Soriano, Javier
    Lizcano, David
    Lara, Juan A.
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 118 : 56 - 72
  • [6] Online machine learning for auto-scaling in the edge computing?
    da Silva, Thiago Pereira
    Neto, Aluizio Rocha
    Batista, Thais Vasconcelos
    Delicato, Flavia C.
    Pires, Paulo F.
    Lopes, Frederico
    [J]. PERVASIVE AND MOBILE COMPUTING, 2022, 87
  • [7] Proactive auto-scaling for cloud environments using temporal convolutional neural networks
    Golshani, Ehsan
    Ashtiani, Mehrdad
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2021, 154 : 119 - 141
  • [8] An event-driven and lightweight proactive auto-scaling architecture for cloud applications
    Akash, Uttom
    Paul, Partha Protim
    Habib, Ahsan
    [J]. INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING, 2023, 14 (05) : 539 - 551
  • [9] Auto-Scaling Approach for Cloud based Mobile Learning Applications
    Almutlaq, Amani Nasser
    Daadaa, Yassine
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (01) : 472 - 479
  • [10] Auto-Scaling with Apprenticeship Learning
    Hakimzadeh, Kamal
    Nicholson, Patrick K.
    Lugones, Diego
    [J]. PROCEEDINGS OF THE 2018 ACM SYMPOSIUM ON CLOUD COMPUTING (SOCC '18), 2018, : 512 - 512