Ditto: Efficient Serverless Analytics with Elastic Parallelism

Cited by: 5
Authors
Jin, Chao [1]
Zhang, Zili [1]
Xiang, Xingyu [1]
Zou, Songyun [1]
Huang, Gang [1]
Liu, Xuanzhe [1]
Jin, Xin [1]
Affiliations
[1] Peking University, Beijing, People's Republic of China
Keywords
Serverless computing; data analytics; task scheduling
DOI
10.1145/3603269.3604816
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Serverless computing provides fine-grained resource elasticity for data analytics: a job can flexibly scale its resources for each stage instead of sticking to a fixed pool of resources throughout its lifetime. Because stages differ in their data dependencies, and because shuffling overheads differ between intra- and inter-server communication, the best degree of parallelism (DoP) for each stage varies with runtime conditions. We present Ditto, a job scheduler for serverless analytics that leverages fine-grained resource elasticity to optimize for job completion time (JCT) and cost. The key idea of Ditto is to use a new scheduling granularity, the stage group, to decouple parallelism configuration from function placement. Ditto bundles stages into stage groups based on their data dependencies and IO characteristics. It exploits the parallelized time characteristics of the stages to determine the parallelism configuration, and prioritizes the placement of stage groups with large shuffling traffic, so that the stages in these groups can leverage zero-copy intra-server communication for efficient shuffling. We build a system prototype of Ditto and evaluate it with a variety of benchmarking workloads. Experimental results show that Ditto outperforms existing solutions by up to 2.5x on JCT and up to 1.8x on cost.
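The placement priority described in the abstract can be illustrated with a minimal sketch. This is not Ditto's actual algorithm, which the abstract does not specify; it is a simple greedy best-fit heuristic under assumed inputs. The names StageGroup, place_groups, slots, shuffle_bytes, and the server names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StageGroup:
    # All fields are illustrative assumptions, not taken from the paper.
    name: str
    slots: int          # function slots the group needs (sum of its stages' DoP)
    shuffle_bytes: int  # traffic shuffled between stages inside the group


def place_groups(groups, free_slots):
    """Greedy placement: groups with the most shuffle traffic choose first.

    free_slots maps server name -> available function slots.
    Returns group name -> server name, or None when no single server fits.
    """
    remaining = dict(free_slots)
    placement = {}
    for g in sorted(groups, key=lambda g: g.shuffle_bytes, reverse=True):
        # Best fit: the tightest server that can still hold the whole group,
        # so its shuffle stays on one machine.
        fits = [s for s, free in remaining.items() if free >= g.slots]
        if not fits:
            placement[g.name] = None  # would have to span servers
            continue
        server = min(fits, key=lambda s: (remaining[s], s))
        remaining[server] -= g.slots
        placement[g.name] = server
    return placement


groups = [
    StageGroup("scan+filter", slots=4, shuffle_bytes=10),
    StageGroup("join+agg", slots=6, shuffle_bytes=500),
    StageGroup("sort", slots=6, shuffle_bytes=200),
]
print(place_groups(groups, {"server-a": 8, "server-b": 6}))
```

In this toy run the two shuffle-heavy groups each land on a single server, while the group with the least shuffle traffic is the one left to span servers, which is the trade-off the abstract's prioritization suggests.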
Pages: 406-419
Page count: 14