Ditto: Efficient Serverless Analytics with Elastic Parallelism

Cited by: 5
Authors
Jin, Chao [1]
Zhang, Zili [1]
Xiang, Xingyu [1]
Zou, Songyun [1]
Huang, Gang [1]
Liu, Xuanzhe [1]
Jin, Xin [1]
Affiliations
[1] Peking University, Beijing, People's Republic of China
Keywords
Serverless computing; data analytics; task scheduling
DOI
10.1145/3603269.3604816
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Serverless computing provides fine-grained resource elasticity for data analytics: a job can flexibly scale its resources for each stage, instead of sticking to a fixed pool of resources throughout its lifetime. Because stages differ in their data dependencies, and because shuffling overhead differs between intra-server and inter-server communication, the best degree of parallelism (DoP) for each stage varies with runtime conditions. We present Ditto, a job scheduler for serverless analytics that leverages fine-grained resource elasticity to optimize for job completion time (JCT) and cost. The key idea of Ditto is to use a new scheduling granularity, the stage group, to decouple parallelism configuration from function placement. Ditto bundles stages into stage groups based on their data dependencies and I/O characteristics. It exploits the parallelized time characteristics of the stages to determine the parallelism configuration, and prioritizes the placement of stage groups with large shuffling traffic, so that the stages in these groups can leverage zero-copy intra-server communication for efficient shuffling. We build a system prototype of Ditto and evaluate it with a variety of benchmarking workloads. Experimental results show that Ditto outperforms existing solutions by up to 2.5× on JCT and up to 1.8× on cost.
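The placement idea in the abstract (stage groups with the largest shuffling traffic are placed first so that they can stay on one server) can be illustrated with a small greedy sketch. This is not Ditto's actual algorithm or API. The `StageGroup` fields, the `place_groups` function, the slot counts, and the server names are all hypothetical, and the best-fit rule is an assumption chosen only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class StageGroup:
    name: str
    slots_needed: int   # hypothetical: function slots the group's stages need in total
    shuffle_bytes: int  # hypothetical: shuffle traffic exchanged inside the group


def place_groups(groups, free_slots):
    """Greedy sketch: handle groups in descending shuffle traffic and try to
    fit each group entirely on one server (assumed best-fit choice).

    free_slots maps a server name to its number of free function slots.
    Returns {group name: server name, or None if no single server fits}.
    """
    remaining = dict(free_slots)
    placement = {}
    for group in sorted(groups, key=lambda g: g.shuffle_bytes, reverse=True):
        # Servers that can host the whole group, so its shuffle stays intra-server.
        candidates = [s for s, free in remaining.items() if free >= group.slots_needed]
        if candidates:
            server = min(candidates, key=lambda s: remaining[s])  # tightest fit
            remaining[server] -= group.slots_needed
            placement[group.name] = server
        else:
            placement[group.name] = None  # would need inter-server shuffling
    return placement


groups = [
    StageGroup("scan+filter", slots_needed=4, shuffle_bytes=1_000),
    StageGroup("join", slots_needed=6, shuffle_bytes=50_000),
    StageGroup("aggregate", slots_needed=3, shuffle_bytes=8_000),
]
placement = place_groups(groups, {"server-a": 8, "server-b": 6})
print(placement)
```

In this toy input the `join` group, which carries the most shuffle traffic, is considered first and gets a server to itself, while the two lighter groups share the other server.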
Pages: 406-419
Page count: 14
Related papers
  • [1] Pocket: Elastic Ephemeral Storage for Serverless Analytics
    Klimovic, Ana
    Wang, Yawen
    Stuedi, Patrick
    Trivedi, Animesh
    Pfefferle, Jonas
    Kozyrakis, Christos
    PROCEEDINGS OF THE 13TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, 2018, : 427 - 444
  • [2] Jiffy: Elastic Far-Memory for Stateful Serverless Analytics
    Khandelwal, Anurag
    Tang, Yupeng
    Agarwal, Rachit
    Akella, Aditya
    Stoica, Ion
    PROCEEDINGS OF THE SEVENTEENTH EUROPEAN CONFERENCE ON COMPUTER SYSTEMS (EUROSYS '22), 2022, : 697 - 713
  • [3] Serverless Data Analytics with Flint
    Kim, Youngbin
    Lin, Jimmy
    PROCEEDINGS 2018 IEEE 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2018, : 451 - 455
  • [4] Enhancing Observability of Serverless Computing with the Serverless Application Analytics Framework
    Cordingly, Robert
    Heydari, Navid
    Yu, Hanfei
    Hoang, Varik
    Sadeghi, Zohreh
    Lloyd, Wes
    COMPANION OF THE ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING, ICPE 2021, 2021, : 161 - 164
  • [5] Towards Efficient Elastic Parallelism for Deep Learning Processor
    Cheng, Jinyu
    Qian, Ruyi
    Shi, Qinwen
    Hu, Gaomei
    Ciao, Mengjuan
    Huo, Qirun
    Xu, Yuanchao
    2022 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING, ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM, 2022, : 363 - 370
  • [6] Serverless Data Analytics in the IBM Cloud
    Sampe, Josep
    Vernik, Gil
    Sanchez-Artigas, Marc
    Garcia-Lopez, Pedro
    MIDDLEWARE INDUSTRY'18: PROCEEDINGS OF THE 2018 ACM/IFIP/USENIX MIDDLEWARE CONFERENCE (INDUSTRIAL TRACK), 2018, : 1 - 8
  • [7] Understanding Ephemeral Storage for Serverless Analytics
    Klimovic, Ana
    Wang, Yawen
    Kozyrakis, Christos
    Stuedi, Patrick
    Pfefferle, Jonas
    Trivedi, Animesh
    PROCEEDINGS OF THE 2018 USENIX ANNUAL TECHNICAL CONFERENCE, 2018, : 789 - 794
  • [8] MXFaaS: Resource Sharing in Serverless Environments for Parallelism and Efficiency
    Stojkovic, Jovan
    Xu, Tianyin
    Franke, Hubertus
    Torrellas, Josep
    PROCEEDINGS OF THE 2023 THE 50TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, ISCA 2023, 2023, : 474 - 488
  • [9] Serverless distributed learning for smart grid analytics
    Huang, Gang
    Wu, Chao
    Hu, Yifan
    Guo, Chuangxin
    Chinese Physics B, 2021, 30 (08) : 650 - 657