Efficient structure similarity searches: a partition-based approach

Cited by: 0
Authors
Xiang Zhao
Chuan Xiao
Xuemin Lin
Wenjie Zhang
Yang Wang
Affiliations
[1] National University of Defense Technology
[2] Collaborative Innovation Center of Geospatial Technology
[3] Nagoya University
[4] The University of New South Wales
Source
The VLDB Journal | 2018 / Vol. 27
Keywords
Graph database; Similarity query; Graph edit distance; Top-$k$ search
DOI
Not available
Abstract
Graphs are widely used to model complex data in many applications, such as bioinformatics, chemistry, social networks, and pattern recognition. A fundamental and critical query primitive is to efficiently search for similar structures in a large collection of graphs. This article mainly studies threshold-based graph similarity search with edit distance constraints. Existing solutions to the problem utilize fixed-size overlapping substructures to generate candidates, and thus become susceptible to large vertex degrees and distance thresholds. In this article, we present a partition-based approach to tackle the problem. By dividing data graphs into variable-size non-overlapping partitions, the edit distance constraint is converted to a graph containment constraint for candidate generation. We develop efficient query processing algorithms based on the novel paradigm. Moreover, candidate-pruning techniques and an improved graph edit distance verification algorithm are developed to boost the performance. In addition, a cost-aware graph partitioning method is devised to optimize the index. Extending the partition-based filtering paradigm, we present a solution to the top-$k$ graph similarity search problem, where tailored filtering, look-ahead and computation-sharing strategies are exploited. Using both public real-life and synthetic datasets, extensive experiments demonstrate that our approaches significantly outperform the baseline and its alternatives.
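The conversion from an edit distance constraint to a containment constraint described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm. It assumes a pigeonhole-style setup in which a data graph is split into τ + 1 non-overlapping partitions, so that τ edit operations can alter at most τ of them and at least one partition must still be contained in the query. All names (`contains`, `is_candidate`, the toy graphs) are hypothetical, only vertex labels are considered, and containment is checked by brute force rather than with an index.

```python
from itertools import permutations

# A labeled graph is a pair (labels, edges):
#   labels: dict mapping vertex id -> label
#   edges:  list of undirected (u, v) pairs
# This representation is an assumption made for illustration.

def contains(part, query):
    """Brute-force test: is `part` subgraph-isomorphic to `query`?"""
    p_labels, p_edges = part
    q_labels, q_edges = query
    p_nodes = list(p_labels)
    q_edge_set = {frozenset(e) for e in q_edges}
    # Try every injective assignment of partition vertices to query vertices.
    for image in permutations(q_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if any(p_labels[v] != q_labels[mapping[v]] for v in p_nodes):
            continue  # vertex labels must match
        if all(frozenset((mapping[u], mapping[v])) in q_edge_set
               for u, v in p_edges):
            return True  # every partition edge is present in the query
    return False

def is_candidate(partitions, query):
    """Keep a data graph only if at least one partition is contained in the query."""
    return any(contains(p, query) for p in partitions)

# Toy data graph C-C-O-N with threshold tau = 1, hence tau + 1 = 2 partitions.
partitions = [
    ({1: "C", 2: "C"}, [(1, 2)]),
    ({3: "O", 4: "N"}, [(3, 4)]),
]

# One relabeling (N -> S) away from the data graph: the C-C partition survives.
q_near = ({"a": "C", "b": "C", "c": "O", "d": "S"},
          [("a", "b"), ("b", "c"), ("c", "d")])

# Contains neither a C-C edge nor an O-N edge: no partition survives.
q_far = ({"a": "C", "b": "N", "c": "C", "d": "O"},
         [("a", "b"), ("b", "c"), ("c", "d")])

print(is_candidate(partitions, q_near))
print(is_candidate(partitions, q_far))
```

A graph that passes this filter is only a candidate. Its edit distance to the query would still have to be verified, which is the step the abstract's improved verification algorithm addresses.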
Pages: 53-78
Number of pages: 25
Related papers
  • [1] Efficient structure similarity searches: a partition-based approach
    Zhao, Xiang
    Xiao, Chuan
    Lin, Xuemin
    Zhang, Wenjie
    Wang, Yang
    [J]. VLDB JOURNAL, 2018, 27 (01): 53 - 78
  • [2] A Partition-Based Approach to Structure Similarity Search
    Zhao, Xiang
    Xiao, Chuan
    Lin, Xuemin
    Liu, Qing
    Zhang, Wenjie
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 7 (03): 169 - 180
  • [3] FrepJoin: an efficient partition-based algorithm for edit similarity join
    Ji-zhou Luo
    Sheng-fei Shi
    Hong-zhi Wang
    Jian-zhong Li
    [J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18 : 1499 - 1510
  • [4] FrepJoin: an efficient partition-based algorithm for edit similarity join
    Ji-zhou LUO
    Sheng-fei SHI
    Hong-zhi WANG
    Jian-zhong LI
    [J]. Frontiers of Information Technology & Electronic Engineering, 2017, 18 (10) : 1499 - 1510
  • [5] FrepJoin: an efficient partition-based algorithm for edit similarity join
    Luo, Ji-zhou
    Shi, Sheng-fei
    Wang, Hong-zhi
    Li, Jian-zhong
    [J]. FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2017, 18 (10) : 1499 - 1510
  • [6] Load Balancing for Partition-based Similarity Search
    Tang, Xun
    Alabduljalil, Maha
    Jin, Xin
    Yang, Tao
    [J]. SIGIR'14: PROCEEDINGS OF THE 37TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2014: 193 - 202
  • [7] PARTITION-BASED PATTERN MATCHING APPROACH FOR EFFICIENT RETRIEVAL OF ARABIC TEXT
    Hakak, Saqib
    Kamsin, Amirrudin
    Shivakumara, Palaiahnakote
    Idris, Mohd Yamani Idna
    [J]. MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2018, 31 (03) : 200 - 209
  • [8] An efficient partition-based parallel PageRank algorithm
    Manaskasemsak, B
    Rungsawang, A
    [J]. 11th International Conference on Parallel and Distributed Systems, Vol I, Proceedings, 2005: 257 - 263
  • [9] Pass-Join: A Partition-based Method for Similarity Joins
    Li, Guoliang
    Deng, Dong
    Wang, Jiannan
    Feng, Jianhua
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2011, 5 (03): 253 - 264
  • [10] A practical partition-based approach for ontology version
    Wang, ZJ
    Zhang, SS
    Wang, YL
    Du, T
    [J]. CURRENT TRENDS IN HIGH PERFORMANCE COMPUTING AND ITS APPLICATIONS, PROCEEDINGS, 2005: 495 - 499