Owl: Performance-Aware Scheduling for Resource-Efficient Function-as-a-Service Cloud

Cited by: 9
Authors
Tian, Huangshi [1]
Li, Suyi [1]
Wang, Ao [2, 3]
Wang, Wei [1]
Wu, Tianlong [3]
Yang, Haoran [3]
Affiliations
[1] HKUST, Hong Kong, People's Republic of China
[2] George Mason University, Fairfax, VA 22030 USA
[3] Alibaba Group, Hangzhou, People's Republic of China
Keywords
serverless; resource-management; scheduling; overcommitment
DOI
10.1145/3542929.3563470
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work documents our experience of improving the scheduler in Alibaba Function Compute, a public FaaS platform. It begins with our observation that memory and CPU are under-utilized in most FaaS sandboxes. A natural solution is to overcommit VM resources when allocating sandboxes, but the ensuing contention may degrade performance and compromise user experience. To complicate matters, degradation in FaaS can also arise from external factors, such as failed dependencies of user functions. We design Owl to achieve both high utilization and performance stability. It introduces a customizable rule system for users to specify their tolerance of degradation, and overcommits resources with a dual approach. (1) For less-invoked functions, it allocates resources to the sandboxes with a usage-based heuristic, keeps monitoring their performance, and remedies any detected degradation. It determines whether a degraded sandbox is affected externally by setting aside a contention-free environment and migrating the affected sandbox there as a comparison baseline. (2) For frequently-invoked functions, Owl profiles the interference patterns among collocated sandboxes and places the sandboxes under the guidance of these profiles. The collocation profiling is designed to cope with the constraint that profiling has to be conducted in production. Owl further consolidates idle sandboxes to reduce resource waste. We prototype Owl in our production system and implement a representative benchmark suite to evaluate it. The results demonstrate that the prototype could reduce VM cost by 43.80% and effectively mitigate latency degradation, with negligible overhead incurred.
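The monitoring step for less-invoked functions can be illustrated with a minimal sketch. It is not Owl's implementation: the names `ToleranceRule`, `is_degraded`, and `diagnose`, the mean-latency metric, and the slowdown-ratio rule are all assumptions made for illustration. The sketch only shows the idea of comparing a sandbox's latency on an overcommitted VM against its latency in a contention-free environment to tell contention apart from an external cause.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ToleranceRule:
    """Hypothetical user rule: tolerated latency ratio over a baseline."""
    max_slowdown: float  # e.g. 1.2 means up to 20% slower is acceptable


def is_degraded(latencies_ms, baseline_ms, rule):
    """Return True when the mean latency exceeds the tolerated slowdown."""
    return mean(latencies_ms) > rule.max_slowdown * baseline_ms


def diagnose(shared_ms, isolated_ms, baseline_ms, rule):
    """Classify a sandbox from latencies measured in two environments.

    shared_ms:   latencies observed on the overcommitted VM
    isolated_ms: latencies observed after migration to a contention-free VM
    """
    if not is_degraded(shared_ms, baseline_ms, rule):
        return "healthy"
    if is_degraded(isolated_ms, baseline_ms, rule):
        # Still slow without neighbours, so the cause is likely external
        # (for example, a failed dependency of the user function).
        return "external"
    # Slow only when collocated, so contention is the likely cause.
    return "contention"
```

For example, with `ToleranceRule(max_slowdown=1.2)` and a 100 ms baseline, a sandbox that averages 155 ms when collocated but about 102 ms in isolation would be classified as `"contention"` under these assumptions.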
Pages: 78 - 93
Page count: 16
Related papers
50 in total
  • [1] GOLGI: Performance-Aware, Resource-Efficient Function Scheduling for Serverless Computing
    Li, Suyi
    Wang, Wei
    Yang, Jun
    Chen, Guangzhen
    Lu, Daohe
    [J]. PROCEEDINGS OF THE 2023 ACM SYMPOSIUM ON CLOUD COMPUTING, SOCC 2023, 2023 : 32 - 47
  • [2] Multilevel resource allocation for performance-aware energy-efficient cloud data centers
    Rossi, Fabio Diniz
    Severo de Souza, Paulo Silas
    Marques, Wagner dos Santos
    Conterato, Marcelo da Silva
    Ferreto, Tiago Coelho
    Lorenzon, Arthur Francisco
    Luizelli, Marcelo Caggiani
    [J]. 2019 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2019 : 462 - 467
  • [3] Energy and Performance-Aware Task Scheduling in a Mobile Cloud Computing Environment
    Lin, Xue
    Wang, Yanzhi
    Xie, Qing
    Pedram, Massoud
    [J]. 2014 IEEE 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2014 : 192 - 199
  • [4] Towards Resource-Efficient Service Function Chain Deployment in Cloud-Fog Computing
    Zhao, Dongcheng
    Liao, Dan
    Sun, Gang
    Xu, Shizhong
    [J]. IEEE ACCESS, 2018, 6 : 66754 - 66766
  • [5] Dependency-aware and Resource-efficient Scheduling for Heterogeneous Jobs in Clouds
    Liu, Jinwei
    Shen, Haiying
    [J]. 2016 8TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE (CLOUDCOM 2016), 2016 : 110 - 117
  • [6] An autonomic resource management system for energy efficient and quality of service aware resource scheduling in cloud environment
    Kumar, Ashok
    Lal, Madan
    Kaur, Sumandeep
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (21):
  • [7] A performance-aware dynamic scheduling algorithm for cloud-based IoT applications
    Pandiyan, Sanjeevi
    Lawrence, T. Samraj
    Sathiyamoorthi, V
    Ramasamy, Manikandan
    Xia, Qian
    Guo, Ya
    [J]. COMPUTER COMMUNICATIONS, 2020, 160 : 512 - 520
  • [8] Cost- and performance-aware resource selection for parallel software on heterogeneous cloud
    Bystrov, Oleg
    Pacevic, Ruslan
    Kaceniauskas, Arnas
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (10):
  • [9] Performance-Aware Cloud Resource Allocation via Fitness-Enabled Auction
    Wang, Hongbing
    Kang, Zuling
    Wang, Lei
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2016, 27 (04) : 1160 - 1173
  • [10] Resource-efficient workflow scheduling in clouds
    Lee, Young Choon
    Han, Hyuck
    Zomaya, Albert Y.
    Yousif, Mazin
    [J]. KNOWLEDGE-BASED SYSTEMS, 2015, 80 : 153 - 162