Towards distributed node similarity search on graphs

Cited by: 5
Authors
Zhang, Tianming [1]
Gao, Yunjun [1]
Zheng, Baihua [2]
Chen, Lu [3]
Wen, Shiting [4]
Guo, Wei [1]
Affiliations
[1] Zhejiang Univ Technol, Coll Comp Sci & Software Engn, Hangzhou, Peoples R China
[2] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
[3] Aalborg Univ, Dept Comp Sci, Aalborg, Denmark
[4] Zhejiang Univ, Ningbo Inst Technol, Ningbo, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Graph; Node similarity search; Distributed processing; Algorithm;
DOI
10.1007/s11280-020-00819-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Node similarity search on graphs has wide applications, such as recommendation and link prediction, to name just a few. However, existing studies are insufficient for two reasons: (i) the scale of real-world graphs is growing rapidly, and (ii) vertices are always associated with complex attributes. In this paper, we propose an efficient distributed framework to support node similarity search on massive graphs, which considers both graph structure correlation and node attribute similarity in metric spaces. The framework consists of a preprocessing stage and a query stage. In the preprocessing stage, a parallel KD-tree construction (KDC) algorithm is developed to form a newly defined graph, the so-called hybrid graph, in order to integrate node attribute similarity into the original graph. To divide graph vertices equally into subsets, KDC adopts KD-tree partitioning after pivot mapping. In addition, two metric pruning rules and an optimized allocation strategy are presented to reduce communication and computation costs. In the query stage, based on the formed hybrid graph, we develop similarity search methods that use random walk with restart (RWR) to measure node similarity. To boost efficiency, we derive tight bounds to rapidly shrink the search region. Extensive experiments with three real massive graphs are conducted to verify the effectiveness, efficiency, and scalability of our proposed techniques.
Pages: 3025-3053
Page count: 29
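
The abstract names random walk with restart (RWR) as the node similarity measure used in the query stage. As a rough illustration of that measure only, here is a minimal single-machine sketch using plain power iteration. It is not the paper's distributed method and includes neither the hybrid graph construction nor the pruning bounds. The function names `rwr_scores` and `top_k_similar`, the restart probability of 0.15, the handling of nodes without out-neighbours, and the toy graph are all assumptions made for this example.

```python
def rwr_scores(adj, source, restart=0.15, tol=1e-10, max_iter=1000):
    """Approximate RWR scores of every node with respect to `source`.

    adj maps each node to a list of its out-neighbours; every node must
    appear as a key. `restart` is the probability of jumping back to
    `source` at each step (0.15 is an assumed value, not from the paper).
    """
    nodes = list(adj)
    scores = {v: 0.0 for v in nodes}
    scores[source] = 1.0  # the walk starts at the source
    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            nbrs = adj[u]
            if not nbrs:
                # Assumption: a node without out-neighbours returns its
                # walking mass to the source.
                nxt[source] += (1.0 - restart) * scores[u]
                continue
            share = (1.0 - restart) * scores[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        nxt[source] += restart  # restart mass (total score mass is 1)
        diff = sum(abs(nxt[v] - scores[v]) for v in nodes)
        scores = nxt
        if diff < tol:
            break
    return scores


def top_k_similar(adj, source, k, restart=0.15):
    """Return the k nodes with the highest RWR score, excluding `source`."""
    scores = rwr_scores(adj, source, restart=restart)
    ranked = sorted((v for v in scores if v != source),
                    key=lambda v: scores[v], reverse=True)
    return ranked[:k]


# Hypothetical toy graph: an undirected path-like graph with one triangle.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
nearest = top_k_similar(toy_graph, "a", k=2)
```

In this sketch a higher score means the walker, which keeps returning to the query node, visits that node more often, so nodes that are close to and well connected with the query node rank first. A full scan like this touches every edge in each iteration, which is the cost that bounds on the search region, as mentioned in the abstract, are meant to avoid.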