Efficient distributed subgraph similarity matching

Cited by: 28
Authors
Yuan, Ye [1]
Wang, Guoren [1]
Xu, Jeffery Yu [2]
Chen, Lei [3]
Affiliations
[1] Northeastern Univ, Shenyang, Peoples R China
[2] Chinese Univ Hong Kong, Shatin, Hong Kong, Peoples R China
[3] Hong Kong Univ Sci & Technol, Hong Kong, Hong Kong, Peoples R China
Source
VLDB JOURNAL | 2015 / Vol. 24 / Issue 03
Keywords
Graph; Similarity matching; Distributed computing; GRAPH; SEARCH;
DOI
10.1007/s00778-015-0381-6
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Given a query graph and a data graph, subgraph similarity matching retrieves all matches of the query graph in the data graph such that the number of missing edges is bounded by a given threshold. Many works have studied this problem because it can handle applications involving noisy or erroneous graph data. In practice, a data graph can be extremely large, e.g., a web-scale graph containing hundreds of millions of vertices and billions of edges. The state-of-the-art approaches employ centralized algorithms to process subgraph similarity queries, and thus they are infeasible for such a large graph due to the limited computational power and storage space of a centralized server. To address this problem, in this paper, we investigate subgraph similarity matching for a web-scale graph deployed in a distributed environment. We propose distributed algorithms and optimization techniques that exploit the properties of subgraph similarity matching, so that we can make good use of the parallel computing power and lower the communication cost among the distributed data centers during query processing. Specifically, we first relax the query graph and decompose it into a minimum number of sub-queries. Next, we send out each sub-query so that exact matching is conducted in parallel. Finally, we schedule and join the exact matches to obtain the final query answers. Moreover, our workload-balance strategy further speeds up query processing. Our experimental results demonstrate the feasibility of our proposed approach in performing subgraph similarity matching over web-scale graph data.
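The problem definition in the abstract can be illustrated with a small brute-force sketch. This is not the paper's distributed algorithm (it performs no relaxation, decomposition, or join scheduling); it only shows what "matches with at most a threshold number of missing edges" means on a tiny undirected graph. The function name `similarity_matches`, the edge-list representation, and the assumption that every query vertex appears in at least one query edge are all hypothetical choices made for this illustration.

```python
from itertools import permutations


def similarity_matches(query_edges, data_edges, threshold):
    """Enumerate injective vertex mappings from the query graph to the
    data graph under which at most `threshold` query edges have no
    corresponding data edge. Graphs are undirected edge lists.

    Brute force over all injective mappings: suitable only for tiny
    inputs, and only meant to illustrate the problem definition.
    """
    # Vertices are derived from the edge lists (assumption: no isolated vertices).
    q_vertices = sorted({v for edge in query_edges for v in edge})
    d_vertices = sorted({v for edge in data_edges for v in edge})
    d_edge_set = {frozenset(edge) for edge in data_edges}

    results = []
    for image in permutations(d_vertices, len(q_vertices)):
        mapping = dict(zip(q_vertices, image))
        # Count query edges whose mapped endpoints are not adjacent in the data graph.
        missing = sum(
            1
            for u, v in query_edges
            if frozenset((mapping[u], mapping[v])) not in d_edge_set
        )
        if missing <= threshold:
            results.append((mapping, missing))
    return results


# Query: a triangle. Data: a path 1-2-3-4, which contains no triangle.
query = [("a", "b"), ("b", "c"), ("a", "c")]
data = [(1, 2), (2, 3), (3, 4)]

print(len(similarity_matches(query, data, 0)))  # 0: no exact match exists
print(len(similarity_matches(query, data, 1)))  # 12: mappings onto {1,2,3} or {2,3,4}
```

With a threshold of 0 the triangle has no match in the path, while allowing one missing edge admits every mapping onto three consecutive path vertices. The cost of this enumeration grows factorially with graph size, which is the kind of blow-up that motivates the decomposition and parallel exact matching described in the abstract.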
Pages: 369-394
Page count: 26