Backup or Not: An Online Cost Optimal Algorithm for Data Analysis Jobs Using Spot Instances

Cited by: 4
Authors
Lin, Liduo [1]
Pan, Li [1]
Liu, Shijun [1]
Affiliation
[1] Shandong Univ, Sch Software, Jinan 250101, Peoples R China
Source
IEEE ACCESS | 2020 / Vol. 8
Keywords
Spot instance; online algorithm; back up; abrupt termination; BIG DATA;
DOI
10.1109/ACCESS.2020.3014978
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Large-scale public cloud providers have recently begun to offer spot instances. This instance type has become increasingly popular with cloud users because of its convenient access mode and low price, especially for big data analysis jobs with high-performance computation requirements. However, spot instances are generally unstable: they can be interrupted, which leads to extra costs for job re-execution. This cost can be greatly reduced if a backup is made at the right time before an interruption. For convenience and cost efficiency, users can store backup data files for future job recovery in the StaaS (Storage-as-a-Service) storage offered by the same cloud provider that supplies their spot instances. Because making backups too often increases costs, users need to time their backup decisions according to when an abrupt interruption will occur. However, it is hard to know or predict precisely when such an interruption will occur. To solve this problem, we propose an online algorithm that guides cloud users in making backups when using spot instances to execute big data analysis jobs, without requiring any information about future interruptions. We prove theoretically that the proposed online algorithm guarantees a bounded competitive ratio of less than 2. Finally, extensive experiments verify the effectiveness of the algorithm in reducing the additional cost caused by interruptions in using spot instances, and show that it still achieves stable cost optimization even when interruptions occur frequently.
Pages: 144945 - 144956
Page count: 12
Related papers
50 in total
  • [21] Online platforms for research data: A requirements and cost analysis
    Reichenbach, Rebecca
    Eberl, Christoph
    Lindenmeier, Joerg
    [J]. SCIENCE AND PUBLIC POLICY, 2022, 49 (04) : 598 - 608
  • [22] PackCache: An Online Cost-Driven Data Caching Algorithm in the Cloud
    Wu, Jiashu
    Dai, Hao
    Wang, Yang
    Zhang, Yong
    Huang, Dong
    Xu, Chengzhong
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2023, 72 (04) : 1208 - 1214
  • [23] An Online Algorithm for Power-proportional Data Centers with Switching Cost
    Zhang, Ming
    Zheng, Zizhan
    Shroff, Ness B.
    [J]. 2018 IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2018, : 6025 - 6032
  • [24] Data correlation analysis for optimal sensor placement using a bond energy algorithm
    Lu, Wei
    Wen, Runfa
    Teng, Jun
    Li, Xiaoling
    Li, Chao
    [J]. MEASUREMENT, 2016, 91 : 509 - 518
  • [25] Optimal preemptive semi-online algorithm for scheduling tightly-grouped jobs on two uniform machines
    Jiang, YW
    He, Y
    [J]. ASIA-PACIFIC JOURNAL OF OPERATIONAL RESEARCH, 2006, 23 (01) : 77 - 88
  • [26] INVESTIGATION INTO OPTIMAL FIXTURING COST OF AN ASSEMBLY USING GENETIC ALGORITHM
    Kumar, E. Raj
    Annamalai, K.
    [J]. ENGINEERING REVIEW, 2014, 34 (02) : 85 - 91
  • [27] An improved seismic data completion algorithm using low-rank tensor optimization: Cost reduction and optimal data orientation
    Popa, Jonathan
    Minkoff, Susan E.
    Lou, Yifei
    [J]. GEOPHYSICS, 2021, 86 (03) : V219 - V232
  • [28] Cost-efficient Data Acquisition on Online Data Marketplaces for Correlation Analysis
    Li, Yanying
    Sun, Haipei
    Dong, Boxiang
    Wang, Hui
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2018, 12 (04) : 362 - 375
  • [29] Online Spatial Data Analysis and Algorithm Development for Geo-scientific Applications Using Remote Sensing Data
    Karnatak, Harish C.
    Singh, Hariom
    Garg, R. D.
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES INDIA SECTION A-PHYSICAL SCIENCES, 2017, 87 (04) : 701 - 712
  • [30] Online Spatial Data Analysis and Algorithm Development for Geo-scientific Applications Using Remote Sensing Data
    Harish C. Karnatak
    Hariom Singh
    R. D. Garg
    [J]. Proceedings of the National Academy of Sciences, India Section A: Physical Sciences, 2017, 87 : 701 - 712