An Efficient Similarity Search Framework for SimRank over Large Dynamic Graphs

Cited by: 44
Authors
Shao, Yingxia [1]
Cui, Bin [1]
Chen, Lei [2]
Liu, Mingming [1]
Xie, Xing [3]
Affiliations
[1] Peking Univ, Sch EECS, Key Lab High Confidence Software Technol MOE, Beijing, Peoples R China
[2] HKUST, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
[3] Microsoft Res, New York, NY USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2015 / Vol. 8 / No. 08
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
DOI
10.14778/2757807.2757809
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
SimRank is an important measure of vertex-pair similarity according to the structure of graphs. Similarity search based on SimRank is an important operation for identifying similar vertices in a graph and has been employed in many data analysis applications. Nowadays, real-world graphs are becoming much larger and more dynamic. Existing solutions for similarity search are expensive in terms of time and space cost, and none of them can efficiently support similarity search over large dynamic graphs. In this paper, we propose a novel two-stage random-walk sampling framework (TSF) for SimRank-based similarity search (e.g., top-k search). In the preprocessing stage, TSF samples a set of one-way graphs to index raw random walks in a novel manner within $O(N R_g)$ time and space, where $N$ is the number of vertices and $R_g$ is the number of one-way graphs. The one-way graphs can be efficiently updated in accordance with graph modifications, so TSF is well suited to dynamic graphs. During the query stage, TSF can search similar vertices fast by naturally pruning unqualified vertices based on the connectivity of the one-way graphs. Furthermore, with additional $R_q$ samples, TSF can estimate the SimRank score, to within a bounded approximation error, with probability $1 - 2e^{-2\epsilon^2 R_g R_q/(1-c)^2}$. Finally, to guarantee the scalability of TSF, the one-way graphs can also be compactly stored on disk when memory is limited. Extensive experiments have demonstrated that TSF can handle dynamic billion-edge graphs with high performance.
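The two stages described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that a one-way graph keeps one randomly chosen in-neighbor per vertex, and it relies on the standard random-walk reading of SimRank, in which the score of a pair is the expected value of c^t, where t is the first step at which two reverse walks meet. The function names (`build_index`, `meeting_step`, `estimate_simrank`), the `max_steps` cut-off, and the toy graph are hypothetical, and the additional R_q query-time samples and the pruning step are omitted.

```python
import random

def build_index(in_neighbors, r_g, seed=0):
    """Preprocessing sketch: sample r_g one-way graphs.

    Each one-way graph maps a vertex to one randomly chosen in-neighbor
    (None if it has no in-neighbors), so the index holds N * r_g entries.
    """
    rng = random.Random(seed)
    return [
        {v: (rng.choice(nbrs) if nbrs else None) for v, nbrs in in_neighbors.items()}
        for _ in range(r_g)
    ]

def meeting_step(one_way, u, v, max_steps):
    """Follow the sampled in-edges from u and v in lockstep.

    Returns the first step at which both walks sit on the same vertex,
    or None if a walk dies out or max_steps is exceeded.
    """
    for step in range(max_steps + 1):
        if u == v:
            return step
        u, v = one_way.get(u), one_way.get(v)
        if u is None or v is None:
            return None
    return None

def estimate_simrank(index, u, v, c=0.6, max_steps=10):
    """Query sketch: average c**t over all one-way graphs (0 if no meeting)."""
    total = 0.0
    for one_way in index:
        step = meeting_step(one_way, u, v, max_steps)
        if step is not None:
            total += c ** step
    return total / len(index)

# Toy graph (assumed for illustration): w -> u, w -> v, and u, v -> x, y.
in_neighbors = {"w": [], "u": ["w"], "v": ["w"], "x": ["u", "v"], "y": ["u", "v"]}
index = build_index(in_neighbors, r_g=200)
print(estimate_simrank(index, "u", "v"))  # exact SimRank for this pair is c
print(estimate_simrank(index, "x", "y"))  # exact SimRank is 0.48 for c = 0.6
```

In this toy graph the walks from `u` and `v` always meet at `w` after one step, so every one-way graph contributes exactly c. For `x` and `y` the estimate varies with the sampled in-neighbors. A dynamic update in this representation would only need to re-sample the entries of vertices whose in-neighbor lists changed, which is one plausible reading of why the abstract calls one-way graphs easy to update.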
Pages: 838 - 849
Number of pages: 12
Related papers
50 in total
  • [21] Efficient Similarity Search on Quasi-Metric Graphs
    Zhang, Tianming
    Gao, Yunjun
    Chen, Lu
    Chen, Guanlin
    Pu, Shiliang
    IEEE ACCESS, 2019, 7 : 101496 - 101512
  • [22] Efficiently Indexing Large Sparse Graphs for Similarity Search
    Wang, Guoren
    Wang, Bin
    Yang, Xiaochun
    Yu, Ge
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (03) : 440 - 451
  • [23] An Experimental Evaluation of SimRank-based Similarity Search Algorithms
    Zhang, Zhipeng
    Shao, Yingxia
    Cui, Bin
    Zhang, Ce
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2017, 10 (05) : 601 - 612
  • [24] Top-k Community Similarity Search Over Large Road-Network Graphs
    Rai, Niranjan
    Lian, Xiang
    2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021 : 2093 - 2098
  • [25] Efficient Similarity Search over Encrypted Data
    Kuzu, Mehmet
    Islam, Mohammad Saiful
    Kantarcioglu, Murat
    2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012 : 1156 - 1167
  • [26] Efficient subgraph search on large anonymized graphs
    Ding, Xiaofeng
    Ou, Yangling
    Jia, Jianhong
    Jin, Hai
    Liu, Jixue
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (23)
  • [27] Efficient Parallel Cycle Search in Large Graphs
    Qing, Zhu
    Yuan, Long
    Chen, Zi
    Lin, Jingjing
    Ma, Guojie
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2020), PT II, 2020, 12113 : 349 - 367
  • [28] Efficient Subgraph Search on Large Anonymized Graphs
    Ding, Xiaofeng
    Ou, Yangling
    Jia, Jianhong
    Jin, Hai
    Liu, Jixue
    2017 INTERNATIONAL CONFERENCE ON GREEN INFORMATICS (ICGI), 2017 : 223 - 228
  • [29] Holistic Subgraph Search over Large Graphs
    Peng, Peng
    Zou, Lei
    Wang, Dong
    Zhao, Dongyan
    WEB-AGE INFORMATION MANAGEMENT, WAIM 2014, 2014, 8485 : 208 - 212
  • [30] Semantic SPARQL Similarity Search Over RDF Knowledge Graphs
    Zheng, Weiguo
    Zou, Lei
    Peng, Wei
    Yan, Xifeng
    Song, Shaoxu
    Zhao, Dongyan
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2016, 9 (11) : 840 - 851