Efficient and Effective Similarity Search over Bipartite Graphs

Cited by: 2
Authors
Yang, Renchi [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
Keywords
Bipartite Graphs; Similarity Search; Approximate Algorithms; Personalized PageRank Queries; Random Walk; Computation; Algorithms
DOI
10.1145/3485447.3511959
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Similarity search over a bipartite graph aims to retrieve from the graph the nodes that are similar to each other, and it finds applications in fields such as online advertising and recommender systems. Existing similarity measures either (i) overlook the unique properties of bipartite graphs, or (ii) fail to capture high-order information between nodes accurately, leading to suboptimal result quality. Recently, Hidden Personalized PageRank (HPP) was applied to this problem and found to be more effective than prior similarity measures. However, existing solutions for HPP computation incur significant computational costs, making HPP inefficient, especially on large graphs. In this paper, we first identify an inherent drawback of HPP and overcome it by proposing bidirectional HPP (BHPP). Then, we formulate similarity search over bipartite graphs as the problem of approximate BHPP computation, and present an efficient solution, Approx-BHPP. Specifically, Approx-BHPP offers rigorous theoretical accuracy guarantees with optimal computational complexity by combining deterministic graph traversal with matrix operations in an optimized and non-trivial way. Moreover, our solution achieves significant gains in practical efficiency due to several carefully designed optimizations. Extensive experiments, comparing BHPP against 8 existing similarity measures over 7 real bipartite graphs, demonstrate the effectiveness of BHPP on query rewriting and item recommendation. Moreover, Approx-BHPP often outperforms baseline solutions by orders of magnitude in terms of computational time on both small and large datasets.
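For readers unfamiliar with random-walk similarity on bipartite graphs, the following is a minimal illustrative sketch of a generic two-hop random walk with restart between nodes on one side of a bipartite graph. It is not the paper's BHPP definition or the Approx-BHPP algorithm, neither of which is specified in this record; the function name `two_hop_rwr`, the restart probability `alpha`, and the toy weight matrix are all assumptions made for illustration. It also assumes every node has at least one edge.

```python
import numpy as np

def two_hop_rwr(W, source, alpha=0.15, tol=1e-10, max_iter=1000):
    """Random walk with restart over U-nodes of a bipartite graph.

    W is a |U| x |V| non-negative weight matrix (hypothetical input format).
    Each step moves U -> V -> U; with probability alpha the walk restarts
    at `source`. Assumes W has no all-zero row or column.
    """
    W = np.asarray(W, dtype=float)
    # Row-normalise the U->V and V->U transition matrices.
    P_uv = W / W.sum(axis=1, keepdims=True)
    P_vu = W.T / W.T.sum(axis=1, keepdims=True)
    # Two-hop transition matrix among U-nodes (row-stochastic).
    P = P_uv @ P_vu

    restart = np.zeros(W.shape[0])
    restart[source] = 1.0
    scores = restart.copy()
    # Plain power iteration until the L1 change falls below tol.
    for _ in range(max_iter):
        updated = alpha * restart + (1.0 - alpha) * (scores @ P)
        converged = np.abs(updated - scores).sum() < tol
        scores = updated
        if converged:
            break
    return scores

# Toy graph: 3 U-nodes (rows) and 3 V-nodes (columns).
W_toy = [[1, 1, 0],
         [0, 1, 1],
         [0, 0, 1]]
toy_scores = two_hop_rwr(W_toy, source=0)
```

In this toy graph, the U-node that shares a neighbour with the source receives a higher score than the one reachable only through longer paths. Exact power iteration like this touches the whole graph on every query, which is the kind of cost that approximate methods aim to avoid.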
Pages: 308 - 318
Number of pages: 11
Related papers
50 in total
  • [31] Efficient Graph Similarity Search Over Large Graph Databases
    Zheng, Weiguo
    Zou, Lei
    Lian, Xiang
    Wang, Dong
    Zhao, Dongyan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (04) : 964 - 978
  • [32] Efficient similarity search over future stream time series
    Lian, Xiang
    Chen, Lei
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2008, 20 (01) : 40 - 54
  • [33] Efficient SimRank-based Similarity Join Over Large Graphs
    Zheng, Weiguo
    Zou, Lei
    Feng, Yansong
    Chen, Lei
    Zhao, Dongyan
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2013, 6 (07) : 493 - 504
  • [34] Learning to Hash for Efficient Search over Incomplete Knowledge Graphs
    Wang, Meng
    Shen, Haomin
    Wang, Sen
    Yao, Lina
    Jiang, Yinlin
    Qi, Guilin
    Chen, Yang
    2019 19TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2019), 2019 : 1360 - 1365
  • [35] Challenges and Techniques for Effective and Efficient Similarity Search in Large Video Databases
    Shao, Jie
    Shen, Heng Tao
    Zhou, Xiaofang
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (02) : 1598 - 1603
  • [36] Text Categorization via Similarity Search: An Efficient and Effective Novel Algorithm
    Duan, Hubert Haoyang
    Pestov, Vladimir G.
    Singla, Varun
    SIMILARITY SEARCH AND APPLICATIONS (SISAP), 2013, 8199 : 182 - 193
  • [37] Achieving Efficient and Privacy-Preserving (α, β)-Core Query Over Bipartite Graphs in Cloud
    Guan, Yunguo
    Lu, Rongxing
    Zheng, Yandong
    Zhang, Songnian
    Shao, Jun
    Wei, Guiyi
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2023, 20 (03) : 1979 - 1993
  • [38] EFFICIENT PARALLEL ALGORITHMS FOR BIPARTITE PERMUTATION GRAPHS
    CHEN, L
    YESHA, Y
    NETWORKS, 1993, 23 (01) : 29 - 39
  • [39] Highly Efficient String Similarity Search and Join over Compressed Indexes
    Xiao, Guorui
    Wang, Jin
    Lin, Chunbin
    Zaniolo, Carlo
    2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022), 2022 : 232 - 244
  • [40] Node Embedding over Attributed Bipartite Graphs
    Ahmed, Hasnat
    Zhang, Yangyang
    Zafar, Muhammad Shoaib
    Sheikh, Nasrullah
    Tai, Zhenying
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2020), PT I, 2020, 12274 : 202 - 210