Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters

Cited by: 66
Authors
Wu, Yu [1]
Zhang, Zhizhong [1]
Wu, Chuan [1]
Guo, Chuanxiong [2]
Li, Zongpeng [3]
Lau, Francis C. M. [1]
Affiliations
[1] Univ Hong Kong, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] Microsoft Res Asia, Wireless & Networking Grp, Beijing, Peoples R China
[3] Univ Calgary, Dept Comp Sci, Calgary, AB, Canada
Keywords
Bulk data transfers; geo-distributed datacenters; software-defined networking
DOI
10.1109/TCC.2015.2389842
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
As it has become the norm for cloud providers to host multiple datacenters around the globe, there is significant demand for large-volume inter-datacenter data transfers, e.g., migration of big data. A challenge arises in how to schedule bulk data transfers at different urgency levels so as to fully utilize the available inter-datacenter bandwidth. The recently emerged Software Defined Networking (SDN) paradigm decouples the control plane from the data paths, enabling potential global optimization of data routing in a network. This paper aims to design a dynamic, highly efficient bulk data transfer service in a geo-distributed datacenter system, and to engineer its design and solution algorithms closely within an SDN architecture. We model data transfer demands as delay-tolerant migration requests with different finishing deadlines. Thanks to the flexibility provided by SDN, we enable dynamic, optimal routing of distinct chunks within each bulk data transfer (instead of treating each transfer as an infinite flow); chunks can be temporarily stored at intermediate datacenters to mitigate bandwidth contention with more urgent transfers. A chunk routing optimization model is formulated to solve for the best chunk transfer schedules over time. To derive the optimal schedules in an online fashion, three algorithms are discussed, namely a bandwidth-reserving algorithm, a dynamically-adjusting algorithm, and a future-demand-friendly algorithm, targeting different levels of optimality and scalability. We build an SDN system based on the Beacon platform and OpenFlow APIs, and carefully engineer our bulk data transfer algorithms in the system. Extensive real-world experiments are carried out to compare the three algorithms with each other and with those from the existing literature, in terms of routing optimality, computational delay and overhead.
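The chunk-level, store-and-forward idea in the abstract can be illustrated with a small sketch. The code below is not the paper's optimization model or any of its three algorithms; it is a simple greedy heuristic, assumed here for illustration only, that serves requests in earliest-deadline-first order and routes each chunk by breadth-first search over (datacenter, time slot) states. All names (`earliest_path`, `schedule`, the datacenters `A`, `B`, `C`, and the capacities) are hypothetical. A chunk may either wait one slot at its current datacenter or cross a link that still has spare capacity in that slot.

```python
from collections import deque

def earliest_path(links, used, src, dst, start, deadline):
    """Find the earliest-arriving route for one chunk.

    links: {(u, v): chunks per slot}; used: {(u, v, slot): chunks already booked}.
    Returns a list of (u, v, slot) hops, or None if the deadline cannot be met.
    """
    parent = {(src, start): None}
    frontier = deque([(src, start)])
    while frontier:
        node, t = frontier.popleft()
        if node == dst:
            hops, state = [], (node, t)
            while parent[state] is not None:
                prev = parent[state]
                if prev[0] != state[0]:        # a link hop, not a storage step
                    hops.append((prev[0], state[0], prev[1]))
                state = prev
            return hops[::-1]
        if t >= deadline:
            continue                            # no time left to move further
        moves = [node]                          # option 1: store the chunk here
        for (u, v), cap in links.items():       # option 2: use a link with room
            if u == node and used.get((u, v, t), 0) < cap:
                moves.append(v)
        for nxt in moves:
            if (nxt, t + 1) not in parent:
                parent[(nxt, t + 1)] = (node, t)
                frontier.append((nxt, t + 1))
    return None

def schedule(links, requests):
    """Greedy earliest-deadline-first chunk scheduling (illustrative heuristic)."""
    used, plan = {}, {}
    for name, src, dst, chunks, deadline in sorted(requests, key=lambda r: r[4]):
        plan[name] = []
        for _ in range(chunks):
            hops = earliest_path(links, used, src, dst, 0, deadline)
            if hops is not None:
                for hop in hops:                # book capacity on each hop
                    used[hop] = used.get(hop, 0) + 1
            plan[name].append(hops)
    return plan

# Hypothetical topology: one chunk per slot on every link.
links = {("A", "C"): 1, ("A", "B"): 1, ("B", "C"): 1}
requests = [("bulk", "A", "C", 2, 4), ("urgent", "A", "C", 2, 2)]
plan = schedule(links, requests)
for name, routes in plan.items():
    print(name, routes)
```

In this toy run the urgent transfer occupies the direct link A to C in slots 0 and 1, so the first bulk chunk is relayed through B and the second waits at A until the direct link frees up in slot 2. A real system would replace the greedy pass with an optimization over all requests jointly.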
Pages: 112-125
Number of pages: 14