Towards distributed node similarity search on graphs

Cited by: 0
Authors
Tianming Zhang
Yunjun Gao
Baihua Zheng
Lu Chen
Shiting Wen
Wei Guo
Affiliations
[1] Zhejiang University of Technology, College of Computer Science and Software Engineering
[2] Singapore Management University, School of Information Systems
[3] Aalborg University, Department of Computer Science
[4] Zhejiang University, The Ningbo Institute of Technology
Source
World Wide Web | 2020 / Vol. 23
Keywords
Graph; Node similarity search; Distributed processing; Algorithm;
DOI
Not available
Abstract
Node similarity search on graphs has wide applications, such as recommendation and link prediction, to name just a few. However, existing studies are insufficient for two reasons: (i) the scale of real-world graphs is growing rapidly, and (ii) vertices are always associated with complex attributes. In this paper, we propose an efficient distributed framework to support node similarity search on massive graphs, which considers both graph structure correlation and node attribute similarity in metric spaces. The framework consists of a preprocessing stage and a query stage. In the preprocessing stage, a parallel KD-tree construction (KDC) algorithm is developed to form a newly defined graph, the so-called hybrid graph, which integrates node attribute similarity into the original graph. To divide the graph vertices equally into subsets, KDC adopts KD-tree partitioning after pivot mapping. In addition, two metric pruning rules and an optimized allocation strategy are presented to reduce communication and computation costs. In the query stage, based on the hybrid graph, we develop similarity search methods that use random walk with restart (RWR) to measure node similarity. To boost efficiency, we derive tight bounds to rapidly shrink the search region. Extensive experiments on three real massive graphs are conducted to verify the effectiveness, efficiency, and scalability of our proposed techniques.
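As a rough illustration of the RWR measure named in the abstract, the sketch below computes RWR scores from one source node by plain power iteration on a small single-machine graph. It is not the paper's distributed method: it ignores the hybrid graph, the pruning rules, and the bounds, and the function name `rwr_scores`, the restart probability of 0.15, and the toy graph are all assumptions made for the example.

```python
def rwr_scores(adj, source, restart=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart from `source` (illustrative sketch only).

    adj: dict mapping each node to a list of its out-neighbors (unweighted).
    restart: probability of jumping back to `source` at each step (assumed value).
    Returns a dict node -> stationary visiting probability.
    """
    nodes = list(adj)
    scores = {v: 0.0 for v in nodes}
    scores[source] = 1.0  # the walk starts at the source node
    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            walk_mass = (1.0 - restart) * scores[u]
            if adj[u]:
                # spread the walking mass evenly over the out-neighbors
                share = walk_mass / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            else:
                # dangling node: send its mass back to the source
                nxt[source] += walk_mass
        nxt[source] += restart  # restart mass (scores always sum to 1)
        delta = sum(abs(nxt[v] - scores[v]) for v in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


# Toy undirected 4-cycle a-b-d-c-a, stored as symmetric adjacency lists.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
}
scores = rwr_scores(graph, "a")
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here `ranked` orders the nodes by their RWR score with respect to `"a"`, so a top-k similarity query would take its first k entries. A higher restart probability keeps the walk closer to the source and makes the scores more local.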
Pages: 3025 - 3053
Page count: 28