Cost-Effective Cloud Server Provisioning for Predictable Performance of Big Data Analytics

Cited by: 32
Authors
Xu, Fei [1 ]
Zheng, Haoyue [1 ]
Jiang, Huan [1 ]
Shao, Wujie [1 ]
Liu, Haikun [2 ]
Zhou, Zhi [3 ]
Affiliations
[1] East China Normal Univ, Dept Comp Sci & Technol, Shanghai Key Lab Multidimens Informat Proc, 3663 N Zhongshan Rd, Shanghai 200062, Peoples R China
[2] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Cluster & Grid Comp Lab, Serv Comp Technol & Syst Lab, 1037 Luoyu Rd, Wuhan 430074, Hubei, Peoples R China
[3] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangdong Key Lab Big Data Anal & Proc, 132 E Waihuan Rd, Guangzhou 510006, Guangdong, Peoples R China
Keywords
Predictable performance; big data analytics; cloud computing; transient server provisioning; data checkpointing
DOI
10.1109/TPDS.2018.2873397
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Cloud datacenters are underutilized due to server over-provisioning. To increase datacenter utilization, cloud providers offer users the option to run workloads such as big data analytics on the underutilized resources, in the form of cheap yet revocable transient servers (e.g., EC2 spot instances, GCE preemptible instances). Although these servers are offered at highly reduced prices, deploying big data analytics on unstable transient servers can severely degrade job performance because of instance revocations. To tackle this issue, this paper proposes iSpot, a cost-effective transient server provisioning framework for achieving predictable performance in the cloud, focusing on Spark as a representative Directed Acyclic Graph (DAG)-style big data analytics workload. iSpot first identifies the cloud transient servers that remain stable during job execution, using an accurate Long Short-Term Memory (LSTM)-based price prediction method. Leveraging automatic job profiling and the acquired DAG information of stages, we further build an analytical performance model and present a lightweight critical data checkpointing mechanism for Spark. Together, these enable the iSpot provisioning strategy, which guarantees job performance on stable transient servers. Extensive prototype experiments on both EC2 spot instances and GCE preemptible instances demonstrate that iSpot is able to guarantee the performance of big data analytics running on cloud transient servers while reducing the job budget by up to 83.8 percent compared with state-of-the-art server provisioning strategies, with acceptable runtime overhead.
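As a rough illustration of the "identify stable transient servers, then provision cost-effectively" idea in the abstract, the following minimal Python sketch picks the cheapest instance type whose predicted price never exceeds a bid. It is not the iSpot algorithm: the function names, the stability rule, the cost estimate, and all prices are hypothetical, and the predicted prices are hard-coded rather than produced by an LSTM.

```python
from typing import Dict, List, Optional


def is_stable(predicted_prices: List[float], bid: float) -> bool:
    """Assumed stability rule: no predicted hourly price exceeds the bid."""
    return all(price <= bid for price in predicted_prices)


def cheapest_stable(
    predictions: Dict[str, List[float]], bids: Dict[str, float]
) -> Optional[str]:
    """Return the stable instance type with the lowest summed predicted price.

    Returns None when no instance type satisfies the stability rule.
    """
    costs = {
        name: sum(prices)
        for name, prices in predictions.items()
        if is_stable(prices, bids[name])
    }
    return min(costs, key=costs.get) if costs else None


# Hypothetical predicted hourly prices (USD) over a three-hour job.
predictions = {
    "type_a": [0.10, 0.12, 0.11],
    "type_b": [0.05, 0.30, 0.06],  # predicted spike above the bid
    "type_c": [0.08, 0.09, 0.08],
}
bids = {"type_a": 0.15, "type_b": 0.10, "type_c": 0.10}

choice = cheapest_stable(predictions, bids)
print(choice)
```

In this toy input, `type_b` is excluded because its predicted price spikes above the bid, and `type_c` is chosen over `type_a` because its summed predicted price is lower. A real provisioning strategy would also need the performance model and checkpointing overhead that the abstract mentions, which this sketch leaves out.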
Pages: 1036 - 1051
Number of pages: 16
Related papers
50 in total
  • [1] Cost-Effective Data Analytics across Multiple Cloud Regions
    Shu, Junyi
    Jin, Xin
    Ma, Yun
    Liu, Xuanzhe
    Huang, Gang
    [J]. PROCEEDINGS OF THE 2021 SIGCOMM 2021 POSTER AND DEMO SESSIONS, SIGCOMM 2021 DEMOS AND POSTERS, 2024, : 1 - 3
  • [2] Cost-Effective Resource Provisioning for MapReduce in a Cloud
    Palanisamy, Balaji
    Singh, Aameek
    Liu, Ling
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2015, 26 (05) : 1265 - 1279
  • [3] iSpot: Achieving Predictable Performance for Big Data Analytics with Cloud Transient Servers
    Xu, Fei
    Jiang, Huan
    Zheng, Haoyue
    Shao, Wujie
    [J]. 2017 15TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS AND 2017 16TH IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC 2017), 2017, : 314 - 321
  • [4] Towards Cost-Effective Cloud Downloading with Tencent Big Data
    Li, Zhen-Hua
    Liu, Gang
    Ji, Zhi-Yuan
    Zimmermann, Roger
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2015, 30 (06) : 1163 - 1174
  • [5] Towards Cost-Effective Cloud Downloading with Tencent Big Data
    Zhen-Hua Li
    Gang Liu
    Zhi-Yuan Ji
    Roger Zimmermann
    [J]. Journal of Computer Science and Technology, 2015, 30 : 1163 - 1174
  • [6] Cost-Effective Service Provisioning for Hybrid Cloud Applications
    Liu, Fangming
    Luo, Bin
    Niu, Yipei
    [J]. MOBILE NETWORKS & APPLICATIONS, 2017, 22 (02) : 153 - 160
  • [7] Cost-Effective Service Provisioning for Hybrid Cloud Applications
    Luo, Bin
    Niu, Yipei
    Liu, Fangming
    [J]. COLLABORATIVE COMPUTING: NETWORKING, APPLICATIONS, AND WORKSHARING, COLLABORATECOM 2015, 2016, 163 : 47 - 56
  • [8] Cost-Effective Service Provisioning for Hybrid Cloud Applications
    Fangming Liu
    Bin Luo
    Yipei Niu
    [J]. Mobile Networks and Applications, 2017, 22 : 153 - 160
  • [9] Cost-Effective, Workload-Adaptive Migration of Big Data Applications to the Cloud
    Giannakouris, Victor
    Fernandez, Alejandro
    Simitsis, Alkis
    Babu, Shivnath
    [J]. SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2019, : 1909 - 1912
  • [10] Cutting the Unnecessary Long Tail: Cost-Effective Big Data Clustering in the Cloud
    Li, Dongwei
    Wang, Shuliang
    Gao, Nan
    He, Qiang
    Yang, Yun
    [J]. IEEE TRANSACTIONS ON CLOUD COMPUTING, 2022, 10 (01) : 292 - 303