Efficient and Effective Similarity Search over Bipartite Graphs

Cited by: 2
Authors
Yang, Renchi [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
Keywords
Bipartite Graphs; Similarity Search; Approximate Algorithms; Personalized PageRank Queries; Random Walk; Computation; Algorithms
DOI
10.1145/3485447.3511959
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Similarity search over a bipartite graph aims to retrieve from the graph the nodes that are similar to each other, and finds applications in fields such as online advertising and recommender systems. Existing similarity measures either (i) overlook the unique properties of bipartite graphs or (ii) fail to capture high-order information between nodes accurately, leading to suboptimal result quality. Recently, Hidden Personalized PageRank (HPP) was applied to this problem and found to be more effective than prior similarity measures. However, existing solutions for HPP computation incur significant computational costs, rendering HPP inefficient, especially on large graphs. In this paper, we first identify an inherent drawback of HPP and overcome it by proposing bidirectional HPP (BHPP). Then, we formulate similarity search over bipartite graphs as the problem of approximate BHPP computation and present an efficient solution, Approx-BHPP. Specifically, Approx-BHPP offers rigorous theoretical accuracy guarantees with optimal computational complexity by combining deterministic graph traversal with matrix operations in an optimized and non-trivial way. Moreover, our solution achieves a significant gain in practical efficiency thanks to several carefully designed optimizations. Extensive experiments comparing BHPP against 8 existing similarity measures over 7 real bipartite graphs demonstrate the effectiveness of BHPP on query rewriting and item recommendation. In addition, Approx-BHPP outperforms baseline solutions, often by orders of magnitude, in terms of computational time on both small and large datasets.
Pages: 308-318
Number of pages: 11