Top-k similarity search in heterogeneous information networks with x-star network schema

Cited by: 33
Authors
Zhang, Mingxi [1, 2]
Hu, Hao [2]
He, Zhenying [2]
Wang, Wei [2]
Affiliations
[1] Univ Shanghai Sci & Technol, Coll Commun & Art Design, Shanghai 200093, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai 201203, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Similarity search; Information network; x-star network schema;
DOI
10.1016/j.eswa.2014.08.039
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An x-star network is an information network consisting of centers that are connected to one another, together with attributes of different types that link to these centers. As x-star networks become ubiquitous, extracting knowledge from them has become an important task. Similarity search in an x-star network aims to find the centers most similar to a given query center, and has numerous applications, including collaborative filtering, community mining and web search. Although existing measures such as SimRank and P-Rank yield promising results, they are not applicable to massive x-star networks. In this paper, we propose a structure-based similarity measure, NetSim, for efficiently computing the similarity between centers in an x-star network. The similarity between attributes is computed in a pre-processing stage as the expected meeting probability over an attribute network extracted from the whole structure of the x-star network. The similarity between centers is computed online from the attribute similarities, based on the intuition that similar centers are linked to similar attributes. NetSim requires less time and space than existing methods because the attribute network is significantly smaller than the whole x-star network. To support fast online query processing, we develop a pruning algorithm that builds a pruning index and uses it to discard unpromising candidate centers. Extensive experiments comparing NetSim with state-of-the-art measures demonstrate the effectiveness and efficiency of our method. (C) 2014 Elsevier Ltd. All rights reserved.
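The two-stage idea in the abstract (attribute similarities computed offline, center similarities computed online from them) can be illustrated with a minimal Python sketch. This is not the paper's NetSim formula: the offline step below uses a plain SimRank-style iteration as a stand-in for the "expected meeting probability", the online step simply averages attribute similarities over all attribute pairs, and the pruning index is omitted. The toy graph, the names `simrank`, `center_similarity`, `top_k`, and the `decay` and `iterations` parameters are all hypothetical.

```python
from itertools import product


def simrank(neighbors, decay=0.8, iterations=10):
    """SimRank-style similarity on an undirected graph {node: set_of_neighbors}.

    Stand-in for the offline attribute-similarity stage (assumption, not NetSim).
    """
    nodes = list(neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new[(a, b)] = 1.0
            elif not neighbors[a] or not neighbors[b]:
                new[(a, b)] = 0.0
            else:
                # Average similarity of the neighbor pairs, damped by `decay`.
                total = sum(sim[(x, y)] for x in neighbors[a] for y in neighbors[b])
                new[(a, b)] = decay * total / (len(neighbors[a]) * len(neighbors[b]))
        sim = new
    return sim


def center_similarity(attrs_u, attrs_v, attr_sim):
    """Online stage (assumed form): mean attribute similarity over all pairs."""
    if not attrs_u or not attrs_v:
        return 0.0
    total = sum(attr_sim[(a, b)] for a in attrs_u for b in attrs_v)
    return total / (len(attrs_u) * len(attrs_v))


def top_k(query, center_attrs, attr_sim, k):
    """Rank all other centers by similarity to `query`; no pruning index here."""
    scores = [
        (center_similarity(center_attrs[query], attrs, attr_sim), center)
        for center, attrs in center_attrs.items()
        if center != query
    ]
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [center for _, center in scores[:k]]


# Hypothetical attribute network: two terms share a venue, a third term does not.
attribute_graph = {
    "term:graph": {"venue:KDD"},
    "term:network": {"venue:KDD"},
    "venue:KDD": {"term:graph", "term:network"},
    "term:protein": {"venue:Bioinf"},
    "venue:Bioinf": {"term:protein"},
}
# Hypothetical centers (e.g. papers) and the attributes linked to them.
center_attrs = {
    "p1": {"term:graph"},
    "p2": {"term:network"},
    "p3": {"term:protein"},
}

attr_sim = simrank(attribute_graph)          # offline pre-processing
result = top_k("p1", center_attrs, attr_sim, k=2)  # online query
print(result)
```

In this sketch the offline table is computed only over the attribute graph, which mirrors the abstract's argument that working on the smaller attribute network reduces time and space cost relative to running a measure over the whole x-star network.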
Pages: 699 - 712
Page count: 14
Related papers
50 in total
  • [1] Semantic Enhanced Top-k Similarity Search on Heterogeneous Information Networks
    Yu, Minghe
    Zhang, Yun
    Zhang, Tiancheng
    Yu, Ge
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2020), PT III, 2020, 12114 : 104 - 119
  • [2] Top-k Similarity Join in Heterogeneous Information Networks
    Xiong, Yun
    Zhu, Yangyong
    Yu, Philip S.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (06) : 1710 - 1723
  • [3] PathSim: Meta Path-Based Top-K Similarity Search in Heterogeneous Information Networks
    Sun, Yizhou
    Han, Jiawei
    Yan, Xifeng
    Yu, Philip S.
    Wu, Tianyi
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2011, 4 (11) : 992 - 1003
  • [4] Panther: Fast Top-k Similarity Search on Large Networks
    Zhang, Jing
    Tang, Jie
    Ma, Cong
    Tong, Hanghang
    Jing, Yu
    Li, Juanzi
    KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 1445 - 1454
  • [5] Fast and Flexible Top-k Similarity Search on Large Networks
    Zhang, Jing
    Tang, Jie
    Ma, Cong
    Tong, Hanghang
    Jing, Yu
    Li, Juanzi
    Luyten, Walter
    Moens, Marie-Francine
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2017, 36 (02)
  • [6] On Top-k Structural Similarity Search
    Lee, Pei
    Lakshmanan, Laks V. S.
    Yu, Jeffrey Xu
    2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, : 774 - 785
  • [7] Fast top-k similarity search in large dynamic attributed networks
    Meng, Zaiqiao
    Shen, Hong
    INFORMATION PROCESSING & MANAGEMENT, 2019, 56 (06)
  • [8] A Distributed Approach for Top-k Star Queries on Massive Information Networks
    Jin, Jiahui
    Khemmarat, Samamon
    Gao, Lixin
    Luo, Junzhou
    2014 20TH IEEE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2014, : 9 - 16
  • [9] Top-k Spatio-textual Similarity Search
    Liu, Sitong
    Chu, Yaping
    Hu, Huiqi
    Feng, Jianhua
    Zhu, Xuan
    WEB-AGE INFORMATION MANAGEMENT, WAIM 2014, 2014, 8485 : 602 - 614
  • [10] Scaling up top-K cosine similarity search
    Zhu, Shiwei
    Wu, Junjie
    Xiong, Hui
    Xia, Guoping
    DATA & KNOWLEDGE ENGINEERING, 2011, 70 (01) : 60 - 83