Cost-Minimizing Online Algorithms for Geo-Distributed Data Analytics

被引:1
|
作者
Huang, Jiao [1 ,2 ]
Huang, Jing [1 ,2 ]
Gao, Shang [1 ,2 ]
Yang, Bo [1 ]
机构
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Jilin, Peoples R China
[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Jilin, Peoples R China
基金
中国国家自然科学基金;
关键词
Approximate nested query; distributed stream processing; resource allocation; error guarantee; CLOUD;
D O I
10.1109/ACCESS.2019.2951682
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Modern enterprises often manage geographically distributed datacenters around the globe. In such environment, datasets are naturally collected and stored in different data centers and were later queried for complex analytics. In this paper, we study the Wide-Area Data Analytics problem, which aims to efficiently control data movements and achieve low latency for overall queries processing, both constrained by limited and expensive network resources across datacenters. Previous papers focus on offline settings of single analytical queries and do not consider time in optimizing system performance, and therefore ignores the dynamics of data and task placement in terms of inter-DC bandwidth utilization. In this paper, we consider the online setting and formulate a cost-minimizing optimization problem over time for arbitrary Directed Acyclic Graph query processing. Considering dynamics of network resource usage, we developed two online algorithms, Online Switch Resist (OSR) and Most Fixed Horizon Control (MFHC) with good competitive ratios. We performed extensive simulations and comparative studies using the TPC-CH benchmark and verified the efficacy of proposed algorithms. The algorithm we proposed is better than the existing algorithm, and its performance approximates the theoretical optimal value.
引用
收藏
页码:163515 / 163525
页数:11
相关论文
共 50 条
  • [1] A Network Cost-aware Geo-distributed Data Analytics System
    Oh, Kwangsung
    Chandra, Abhishek
    Weissman, Jon
    2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020), 2020, : 649 - 658
  • [2] Network Cost-Aware Geo-Distributed Data Analytics System
    Oh, Kwangsung
    Zhang, Minmin
    Chandra, Abhishek
    Weissman, Jon
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (06) : 1407 - 1420
  • [3] Low Latency Geo-distributed Data Analytics
    Pu, Qifan
    Ananthanarayanan, Ganesh
    Bodik, Peter
    Kandula, Srikanth
    Akella, Aditya
    Bahl, Paramvir
    Stoica, Ion
    ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2015, 45 (04) : 421 - 434
  • [4] Low Latency Geo-distributed Data Analytics
    Pu, Qifan
    Ananthanarayanan, Ganesh
    Bodik, Peter
    Kandula, Srikanth
    Akella, Aditya
    Bahl, Paramvir
    Stoica, Ion
    SIGCOMM'15: PROCEEDINGS OF THE 2015 ACM CONFERENCE ON SPECIAL INTEREST GROUP ON DATA COMMUNICATION, 2015, : 421 - 434
  • [5] Optimizing Timeliness and Cost in Geo-Distributed Streaming Analytics
    Heintz, Benjamin
    Chandra, Abhishek
    Sitaraman, Ramesh K.
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2020, 8 (01) : 232 - 245
  • [6] Optimizing Timeliness and Cost in Geo-Distributed Streaming Analytics
    Heintz, Benjamin
    Chandra, Abhishek
    Sitaraman, Ramesh K.
    IEEE Transactions on Cloud Computing, 2020, 8 (01): : 232 - 245
  • [7] WANalytics: Geo-Distributed Analytics for a Data Intensive World
    Vulimiri, Ashish
    Curino, Carlo
    Godfrey, P. Brighten
    Jungblut, Thomas
    Karanasos, Konstantinos
    Padhye, Jitu
    Varghese, George
    SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2015, : 1087 - 1092
  • [8] Bohr: Similarity Aware Geo-Distributed Data Analytics
    Li, Hangyu
    Xu, Hong
    Nutanong, Sarana
    CONEXT'18: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON EMERGING NETWORKING EXPERIMENTS AND TECHNOLOGIES, 2018, : 267 - 279
  • [9] Optimizing the Cost-Performance Tradeoff for Geo-distributed Data Analytics with Uncertain Demand
    Li, Wenxin
    Xu, Renhai
    Qi, Heng
    Li, Keqiu
    Zhou, Xiaobo
    2017 IEEE/ACM 25TH INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE (IWQOS), 2017,
  • [10] Minimizing latency in geo-distributed clouds
    Marzieh Malekimajd
    Ali Movaghar
    Seyedmahyar Hosseinimotlagh
    The Journal of Supercomputing, 2015, 71 : 4423 - 4445