Endpoint-Flexible Coflow Scheduling Across Geo-Distributed Datacenters

Cited by: 13
Authors
Li, Wenxin [1]
Yuan, Xu [2]
Li, Keqiu [3]
Qi, Heng [4]
Zhou, Xiaobo [3]
Xu, Renhai [3]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Univ Louisiana Lafayette, Sch Comp & Informat, Lafayette, LA 70503 USA
[3] Tianjin Univ, Coll Intelligence & Comp, Tianjin Key Lab Adv Networking TANK, Tianjin 300350, Peoples R China
[4] Dalian Univ Technol, Sch Comp Sci & Technol, 2 Linggong Rd, Dalian 116023, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Task analysis; Bandwidth; Scheduling; Heuristic algorithms; Distributed databases; Approximation algorithms; Data models; Inter-datacenter; coflow scheduling; CCT; deadline; endpoint flexibility;
DOI
10.1109/TPDS.2020.2992615
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Over the last decade, we have witnessed growing volumes of data generated and stored across geographically distributed datacenters. Processing such geo-distributed datasets may suffer significant slowdown, because the underlying network flows must traverse inter-datacenter networks whose available link bandwidth is relatively low and highly heterogeneous. Thus, optimizing the transmission of inter-datacenter flows, especially coflows that capture application-level semantics, is important for improving the communication performance of such geo-distributed applications. However, prior coflow scheduling solutions have a significant limitation: they schedule coflows whose flow endpoints are already fixed, which makes them insufficient for optimizing the coflow completion time (CCT). In this article, we focus on the problem of jointly considering endpoint placement and coflow scheduling to minimize the average CCT of coflows across geo-distributed datacenters. To solve this problem without any prior knowledge of coflow arrivals, we present a coflow-aware optimization framework called SmartCoflow. In SmartCoflow, we first apply an approximation algorithm to obtain the endpoint placement and scheduling decisions for a single coflow. Based on the single-coflow solution, we then develop an efficient online algorithm to handle dynamically arriving coflows. Through rigorous theoretical analysis, we prove that SmartCoflow has a non-trivial competitive ratio. We also extend SmartCoflow to incorporate various design choices or requirements of applications and operators, such as enforcing an inter-datacenter bandwidth usage budget and considering coflow deadlines. Through experimental results from a testbed implementation and trace-driven simulations, we demonstrate that SmartCoflow can reduce the average CCT, lower bandwidth usage, and improve the coflow deadline meet rate when compared to the state-of-the-art scheduling-only method.
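To make the role of endpoint flexibility concrete, the following is a minimal sketch in Python of a toy model, not the SmartCoflow algorithm itself. It assumes a single coflow whose flows all converge on one receiving task, that each flow gets the full bandwidth of its inter-datacenter link, and that the CCT is the transfer time of the slowest flow. The function names, datacenter labels, data volumes, and bandwidth values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple


def coflow_completion_time(
    volumes: Dict[str, float],
    bandwidth: Dict[Tuple[str, str], float],
    dst: str,
) -> float:
    """CCT of a toy coflow when every flow is sent to datacenter `dst`.

    The coflow finishes only when its slowest flow finishes, so the CCT is
    the maximum per-flow transfer time. Data already stored at `dst` needs
    no inter-datacenter transfer and contributes zero time.
    """
    return max(
        0.0 if src == dst else vol / bandwidth[(src, dst)]
        for src, vol in volumes.items()
    )


def best_endpoint(
    volumes: Dict[str, float],
    bandwidth: Dict[Tuple[str, str], float],
    candidates: Iterable[str],
) -> str:
    """Brute-force search for the receiving datacenter with the smallest CCT."""
    return min(
        candidates,
        key=lambda dst: coflow_completion_time(volumes, bandwidth, dst),
    )


# Hypothetical input: data volume (GB) held at each source datacenter.
volumes = {"A": 40.0, "B": 10.0, "C": 20.0}

# Hypothetical heterogeneous link bandwidth (GB/s) for each (src, dst) pair.
bandwidth = {
    ("A", "B"): 2.0, ("A", "C"): 4.0,
    ("B", "A"): 2.0, ("B", "C"): 5.0,
    ("C", "A"): 4.0, ("C", "B"): 1.0,
}

for dst in ("A", "B", "C"):
    print(dst, coflow_completion_time(volumes, bandwidth, dst))
print("best endpoint:", best_endpoint(volumes, bandwidth, ("A", "B", "C")))
```

In this toy instance, a scheduler that is handed a receiver already fixed at datacenter B cannot do better than a CCT of 20 s, whereas placing the receiver at A gives 5 s. The sketch ignores bandwidth sharing among concurrent coflows, online arrivals, budgets, and deadlines, all of which the abstract says the actual framework addresses.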
Pages: 2466-2481
Page count: 16