Efficient Graph Query Processing over Geo-Distributed Datacenters

Cited by: 8
Authors
Yuan, Ye [1]
Ma, Delong [2]
Wen, Zhenyu [3]
Ma, Yuliang [2]
Wang, Guoren [1]
Chen, Lei [4]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
[2] Northeastern Univ, Shenyang, Peoples R China
[3] Newcastle Univ, Newcastle Upon Tyne, Tyne & Wear, England
[4] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
Keywords
Graph search; Geo-distributed; Datacenters; MapReduce
DOI
10.1145/3397271.3401157
CLC number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Graph queries have emerged as one of the fundamental techniques to support modern search services, such as PageRank web search, social networking search and knowledge graph search. Because such graphs are maintained globally and are very large (e.g., billions of nodes), graph queries must be processed efficiently across multiple geographically distributed datacenters, that is, as geo-distributed graph queries. Existing graph computing frameworks may not work well for geographically distributed datacenters, because they implement a Bulk Synchronous Parallel model that requires excessive inter-datacenter transfers, thereby introducing very high latency for query processing. In this paper, we propose GeoGraph, a universal framework that supports efficient geo-distributed graph query processing based on datacenter clustering and a meta-graph, while reducing inter-datacenter communication. The framework can be applied to many types of graph algorithms without any modification, and it is developed on top of Apache Giraph. The experiments apply four important graph queries: shortest path, graph keyword search, subgraph isomorphism and PageRank. The evaluation results show that the proposed framework can achieve up to 82% faster convergence, 42% lower WAN bandwidth usage, and 45% less total monetary cost for the four graph queries, with input graphs stored across ten geo-distributed datacenters.
Pages: 619-628
Page count: 10
Related papers
50 in total
  • [21] Transformation-Based Streaming Workflow Allocation on Geo-Distributed Datacenters for Streaming Big Data Processing
    Chen, Wuhui
    Paik, Incheon
    Hung, Patrick C. K.
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2019, 12 (04) : 654 - 668
  • [22] Online Control of Service Function Chainings Across Geo-Distributed Datacenters
    Yang, Song
    Li, Fan
    Zhou, Zhi
    Chen, Xu
    Wang, Yu
    Fu, Xiaoming
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2023, 22 (06) : 3558 - 3571
  • [23] Online Scaling of NFV Service Chains Across Geo-Distributed Datacenters
    Jia, Yongzheng
    Wu, Chuan
    Li, Zongpeng
    Le, Franck
    Liu, Alex
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2018, 26 (02) : 699 - 710
  • [24] Optimizing Geo-Distributed Data Processing with Resource Heterogeneity over the Internet
    Marzuni, Saeed mirpour
    Toosi, Adel
    Savadi, Abdorreza
    Naghibzadeh, Mahmud
    Taniar, David
    ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2025, 25 (01)
  • [25] Graph partition-based data and task co-scheduling of scientific workflow in geo-distributed datacenters
    Zhang, Jinghui
    Chen, Jian
    Zhan, Jun
    Jin, Jiahui
    Song, Aibo
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (24)
  • [26] Workload and energy management of geo-distributed datacenters considering demand response programs
    Zhao, Mengmeng
    Wang, Xiaoying
    Mo, Junrong
    SUSTAINABLE ENERGY TECHNOLOGIES AND ASSESSMENTS, 2023, 55
  • [27] Optimizing the Cost-Performance Tradeoff for Coflows Across Geo-Distributed Datacenters
    Xu, Xinping
    Li, Wenxin
    Li, Keqiu
    Qi, Heng
    Jin, Yingwei
    IEEE ACCESS, 2018, 6 : 24488 - 24497
  • [28] Scheduling Jobs across Geo-Distributed Datacenters with Max-Min Fairness
    Chen, Li
    Liu, Shuhao
    Li, Baochun
    Li, Bo
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2019, 6 (03) : 488 - 500
  • [29] Optimal Query Plans for Geo-distributed Data Analytics at Scale
    Pradhan, Ahana
    Karthik, Srinivas
    Subramanya, Raghunandan
    PROCEEDINGS OF 7TH JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE AND MANAGEMENT OF DATA, CODS-COMAD 2024, 2024 : 247 - 251
  • [30] Compliant Geo-distributed Data Processing in Action
    Beedkar, Kaustubh
    Brekardin, David
    Quiane-Ruiz, Jorge-Arnulfo
    Markl, Volker
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (12) : 2843 - 2846