The Power of Prediction: Microservice Auto Scaling via Workload Learning

Cited by: 31
Authors
Luo, Shutian [1, 2, 3, 5]
Xu, Huanle [3, 5]
Ye, Kejiang [1, 5]
Xu, Guoyao [4]
Zhang, Liping [4]
Yang, Guodong [4]
Xu, Chengzhong [3, 5]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
[2] Univ CAS, Beijing, Peoples R China
[3] Univ Macau, Zhuhai, Peoples R China
[4] Alibaba Grp, Hangzhou, Peoples R China
[5] Guangdong Hong Kong Macao Joint Lab Human Machine, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Microservices; Proactive Auto-scaler; Workload Uncertainty Learning;
DOI
10.1145/3542929.3563477
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
When deploying microservices in production clusters, it is critical to automatically scale containers to improve cluster utilization and ensure service level agreements (SLA). Although reactive scaling approaches work well for monolithic architectures, they are not necessarily suitable for microservice frameworks due to the long delay caused by complex microservice call chains. In contrast, existing proactive approaches leverage end-to-end performance prediction for scaling, but cannot effectively handle microservice multiplexing and dynamic microservice dependencies. In this paper, we present Madu, a proactive microservice auto-scaler that scales containers based on predictions for individual microservices. Madu learns workload uncertainty to handle the highly dynamic dependency between microservices. Additionally, Madu adopts OS-level metrics to optimize resource usage while maintaining good control over scaling overhead. Experiments on large-scale deployments of microservices in Alibaba clusters show that the overall prediction accuracy of Madu can reach as high as 92.3% on average, which is 13% higher than the state-of-the-art approaches. Furthermore, experiments running real-world microservice benchmarks in a local cluster of 20 servers show that Madu can reduce the overall resource usage by 1.7x compared to reactive solutions, while reducing end-to-end service latency by 50%.
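To make the idea of scaling on a per-microservice workload prediction with learned uncertainty concrete, here is a minimal sketch. It is not Madu's algorithm, which the abstract does not specify. It assumes a predictor that returns a mean and a standard deviation for the next interval's request rate, and it provisions for an upper bound on that prediction. The function name `containers_needed`, the safety factor `k`, and the per-container capacity are all hypothetical.

```python
import math


def containers_needed(pred_mean, pred_std, per_container_capacity,
                      k=2.0, min_containers=1):
    """Hypothetical helper: size a microservice for the next interval.

    pred_mean / pred_std: predicted request rate and its uncertainty
        (assumed outputs of some workload predictor, in requests/s).
    per_container_capacity: requests/s one container can serve (assumed known).
    k: safety factor; a larger k provisions more headroom for uncertainty.
    """
    if per_container_capacity <= 0:
        raise ValueError("per_container_capacity must be positive")
    # Provision for an upper bound on the workload, not just the mean.
    upper_bound = pred_mean + k * pred_std
    return max(min_containers, math.ceil(upper_bound / per_container_capacity))


if __name__ == "__main__":
    # Same predicted mean, different uncertainty.
    print(containers_needed(1000, 0, 300))
    print(containers_needed(1000, 100, 300))
```

In this sketch a more uncertain prediction leads to more headroom for the same predicted mean, which is one simple way uncertainty can feed into a proactive scaling decision.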
Pages: 355-369
Number of pages: 15