Top-k similarity search in heterogeneous information networks with x-star network schema

Cited by: 33
Authors
Zhang, Mingxi [1, 2]
Hu, Hao [2]
He, Zhenying [2]
Wang, Wei [2]
Affiliations
[1] Univ Shanghai Sci & Technol, Coll Commun & Art Design, Shanghai 200093, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai 201203, Peoples R China
Funding
US National Science Foundation;
Keywords
Similarity search; Information network; x-star network schema;
DOI
10.1016/j.eswa.2014.08.039
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An x-star network is an information network consisting of centers that are connected among themselves, together with attributes of different types that link to these centers. As x-star networks become ubiquitous, extracting knowledge from them has become an important task. Similarity search in an x-star network aims to find the centers most similar to a given query center, and it has numerous applications, including collaborative filtering, community mining, and web search. Although existing measures such as SimRank and P-Rank yield promising results, they are not applicable to massive x-star networks. In this paper, we propose NetSim, a structure-based similarity measure for efficiently computing the similarity between centers in an x-star network. The similarity between attributes is computed in a pre-processing stage as the expected meeting probability over the attribute network, which is extracted from the overall structure of the x-star network. The similarity between centers is then computed online from the attribute similarities, based on the intuition that similar centers are linked to similar attributes. NetSim requires less time and space than existing methods because the attribute network is significantly smaller than the whole x-star network. To support fast online query processing, we develop a pruning algorithm that builds a pruning index and uses it to discard unpromising candidate centers. Extensive experiments comparing NetSim with state-of-the-art measures demonstrate the effectiveness and efficiency of our method. (C) 2014 Elsevier Ltd. All rights reserved.
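The two-stage idea in the abstract (attribute similarities precomputed offline, center similarities derived from them online) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the abstract does not give NetSim's formula, so the averaging rule in `center_similarity`, the helper names, and the toy data are hypothetical, and the pruning index is omitted entirely.

```python
import heapq
from itertools import product


def attribute_similarity(attr_sim, x, y):
    """Look up a precomputed attribute similarity (symmetric, 1.0 for identical attributes)."""
    if x == y:
        return 1.0
    return attr_sim.get((x, y), attr_sim.get((y, x), 0.0))


def center_similarity(attrs_a, attrs_b, attr_sim):
    """Assumed aggregation: average attribute similarity over all attribute pairs.

    This is NOT the NetSim formula; it only mirrors the intuition that
    similar centers are linked to similar attributes.
    """
    if not attrs_a or not attrs_b:
        return 0.0
    total = sum(attribute_similarity(attr_sim, x, y)
                for x, y in product(attrs_a, attrs_b))
    return total / (len(attrs_a) * len(attrs_b))


def top_k_similar(query, center_attrs, attr_sim, k):
    """Return the k centers most similar to `query` as (score, center) pairs.

    Brute-force scan over all centers; the paper's pruning index is not modeled.
    """
    scores = ((center_similarity(center_attrs[query], attrs, attr_sim), center)
              for center, attrs in center_attrs.items() if center != query)
    return heapq.nlargest(k, scores)


# Hypothetical toy data: papers as centers, authors (a*) and venues (v*) as attributes.
center_attrs = {
    "p1": ["a1", "v1"],
    "p2": ["a2", "v1"],
    "p3": ["a3", "v2"],
}
# Stand-in for the offline stage: attribute similarities assumed already computed.
attr_sim = {("a1", "a2"): 0.5, ("v1", "v2"): 0.2}

result = top_k_similar("p1", center_attrs, attr_sim, k=1)
```

In this toy setup `p2` ranks first for query `p1` because the two share venue `v1` and have moderately similar authors. A real implementation would replace the brute-force scan with the pruning step the abstract describes.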
Pages: 699-712
Number of pages: 14
Related papers
50 in total
  • [41] REPOSE: Distributed Top-k Trajectory Similarity Search with Local Reference Point Tries
    Zheng, Bolong
    Weng, Lianggui
    Zhao, Xi
    Zeng, Kai
    Zhou, Xiaofang
    Jensen, Christian S.
    2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 708 - 719
  • [42] Exploiting Transitive Similarity and Temporal Dynamics for Similarity Search in Heterogeneous Information Networks
    He, Jiazhen
    Bailey, James
    Zhang, Rui
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2014, PT II, 2014, 8422 : 141 - 155
  • [43] Scalable top-k query on information networks with hierarchical inheritance relations
    Wu, Fubao
    Gao, Lixin
    DISTRIBUTED AND PARALLEL DATABASES, 2024, 42 (01) : 1 - 30
  • [44] Edit Distance Based Similarity Search of Heterogeneous Information Networks
    Lu, Jianhua
    Lu, Ningyun
    Ma, Sipei
    Zhang, Baili
    DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA 2018), PT II, 2018, 11030 : 195 - 202
  • [45] Neural PathSim for Inductive Similarity Search in Heterogeneous Information Networks
    Xiao, Wenyi
    Zhao, Huan
    Zheng, Vincent W.
    Song, Yangqiu
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 2201 - 2210
  • [46] TSS: Temporal similarity search measure for heterogeneous information networks
    Nikmehr, Golnaz
    Salehi, Mostafa
    Jalili, Mandi
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2019, 524 : 696 - 707
  • [47] Scalable top-k query on information networks with hierarchical inheritance relations
    Fubao Wu
    Lixin Gao
    Distributed and Parallel Databases, 2024, 42 : 1 - 30
  • [48] Continuous Top-k Processing of Social Network Information Streams: A Vision
    Alkhouli, Abdulhafiz
    Vodislav, Dan
    Borzic, Boris
    INFORMATION SEARCH, INTEGRATION AND PERSONALIZATION, ISIP 2014, 2016, 497 : 35 - 48
  • [49] Heterogeneous Information Network-Based Patient Similarity Search
    Huang, Hao-zhe
    Lu, Xu-dong
    Guo, Wei
    Jiang, Xin-bo
    Yan, Zhong-min
    Wang, Shi-peng
    FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY, 2021, 9
  • [50] ALPS: an efficient algorithm for top-k spatial preference search in road networks
    Hyung-Ju Cho
    Se Jin Kwon
    Tae-Sun Chung
    Knowledge and Information Systems, 2015, 42 : 599 - 631