On Achieving Efficient Data Transfer for Graph Processing in Geo-Distributed Datacenters

Cited by: 24
Authors
Zhou, Amelie Chi [1]
Ibrahim, Shadi [1]
He, Bingsheng [2]
Affiliations
[1] Inria Rennes, Bretagne Atlantique Res Ctr, Rennes, France
[2] Natl Univ Singapore, Singapore, Singapore
Keywords
Graph partitioning; Heterogeneous network; Geo-distributed datacenters;
DOI
10.1109/ICDCS.2017.98
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Graph partitioning is important for optimizing the performance and communication cost of large graph processing jobs. Recently, many graph applications such as social networks store their data on geo-distributed datacenters (DCs) to provide services worldwide with low latency. This poses new challenges for existing graph partitioning methods, due to costly Wide Area Network (WAN) usage and the multiple levels of network heterogeneity in geo-distributed DCs. In this paper, we propose a geo-aware graph partitioning method named G-Cut, which aims at minimizing the inter-DC data transfer time of graph processing jobs in geo-distributed DCs while satisfying the WAN usage budget. G-Cut adopts two novel optimization phases that separately address the two challenges of WAN usage and network heterogeneity. G-Cut can also be applied to partition dynamic graphs thanks to its lightweight runtime overhead. We evaluate the effectiveness and efficiency of G-Cut using real-world graphs on both real geo-distributed DCs and simulations. Evaluation results show that G-Cut can reduce the inter-DC data transfer time by up to 58% and reduce the WAN usage by up to 70% compared to state-of-the-art graph partitioning methods, with low runtime overhead.
Pages: 1397 - 1407
Number of pages: 11
Related papers
50 in total
  • [1] Efficient Graph Query Processing over Geo-Distributed Datacenters
    Yuan, Ye
    Ma, Delong
    Wen, Zhenyu
    Ma, Yuliang
    Wang, Guoren
    Chen, Lei
    [J]. PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020: 619 - 628
  • [2] Cost-Aware Partitioning for Efficient Large Graph Processing in Geo-Distributed Datacenters
    Zhou, Amelie Chi
    Shen, Bingkun
    Xiao, Yao
    Ibrahim, Shadi
    He, Bingsheng
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (07) : 1707 - 1723
  • [3] Towards Efficient Graph Processing in Geo-Distributed Data Centers
    Yao, Feng
    Tao, Qian
    Lin, Shengyuan
    Zhang, Yanfeng
    Yu, Wenyuan
    Gong, Shufeng
    Wang, Qiange
    Yu, Ge
    Zhou, Jingren
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (11) : 2147 - 2160
  • [4] Efficient Geo-Distributed Data Processing with Rout
    Jayalath, Chamikara
    Eugster, Patrick
    [J]. 2013 IEEE 33RD INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS), 2013: 470 - 480
  • [5] Cost-Aware Big Data Processing Across Geo-Distributed Datacenters
    Xiao, Wenhua
    Bao, Weidong
    Zhu, Xiaomin
    Liu, Ling
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2017, 28 (11) : 3114 - 3127
  • [6] On efficient virtual cluster scaling across geo-distributed datacenters
    Xu, Xinping
    Li, Wenxin
    Qi, Heng
    Li, Keqiu
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (10)
  • [7] Efficient Data and Task Co-Scheduling for Scientific Workflow in Geo-distributed Datacenters
    Chen, Jian
    Zhang, Jinghui
    Song, Aibo
    [J]. 2017 FIFTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2017: 63 - 68
  • [8] Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters
    Wu, Yu
    Zhang, Zhizhong
    Wu, Chuan
    Guo, Chuanxiong
    Li, Zongpeng
    Lau, Francis C. M.
    [J]. IEEE TRANSACTIONS ON CLOUD COMPUTING, 2017, 5 (01) : 112 - 125
  • [9] Flutter: Scheduling Tasks Closer to Data Across Geo-Distributed Datacenters
    Hu, Zhiming
    Li, Baochun
    Luo, Jun
    [J]. IEEE INFOCOM 2016 - THE 35TH ANNUAL IEEE INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATIONS, 2016
  • [10] Scheduling Jobs Across Geo-distributed Datacenters
    Hung, Chien-Chun
    Golubchik, Leana
    Yu, Minlan
    [J]. ACM SOCC'15: PROCEEDINGS OF THE SIXTH ACM SYMPOSIUM ON CLOUD COMPUTING, 2015: 111 - 124