Mining Query-Based Subnetwork Outliers in Heterogeneous Information Networks

Cited by: 11
Authors
Zhuang, Honglei [1]
Zhang, Jing [2]
Brova, George [1]
Tang, Jie [2]
Cam, Hasan [3]
Yan, Xifeng [4]
Han, Jiawei [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Champaign, IL 61801 USA
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
[3] US Army, Res Lab, Adelphi, MD USA
[4] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
DOI
10.1109/ICDM.2014.85
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Mining outliers in a heterogeneous information network is a challenging problem: It is even unclear what should be outliers in a large heterogeneous network (e.g., outliers in the entire bibliographic network consisting of authors, titles, papers and venues). In this study, we propose an interesting class of outliers, query-based subnetwork outliers: Given a heterogeneous network, a user raises a query to retrieve a set of task-relevant subnetworks, among which, subnetwork outliers are those that significantly deviate from others (e.g., outliers of author groups among those studying "topic modeling"). We formalize this problem and propose a general framework, where one can query for finding subnetwork outliers with respect to different semantics. We introduce the notion of subnetwork similarity that captures the proximity between two subnetworks by their membership distributions. We propose an outlier detection algorithm to rank all the subnetworks according to their outlierness without tuning parameters. Our quantitative and qualitative experiments on both synthetic and real data sets show that the proposed method outperforms other baselines.
Pages: 1127-1132
Page count: 6
Related papers
50 in total
  • [31] Query-based summarization of customer reviews
    Feiguina, Olga
    Lapalme, Guy
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4509 : 452 - +
  • [32] Meta-Path-Based Search and Mining in Heterogeneous Information Networks
    Yizhou Sun
    Jiawei Han
    Tsinghua Science and Technology, 2013, 18 (04) : 329 - 338
  • [33] Query-Based Argumentation in Agent Programming
    Gottifredi, Sebastian
    Garcia, Alejandro J.
    Simari, Guillermo R.
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2010, 2010, 6433 : 284 - 295
  • [34] Query-Based Linked Data Anonymization
    Delanaux, Remy
    Bonifati, Angela
    Rousset, Marie-Christine
    Thion, Romuald
    SEMANTIC WEB - ISWC 2018, PT I, 2018, 11136 : 530 - 546
  • [35] Meta-Path-Based Search and Mining in Heterogeneous Information Networks
    Sun, Yizhou
    Han, Jiawei
    TSINGHUA SCIENCE AND TECHNOLOGY, 2013, 18 (04) : 329 - 338
  • [36] Query-based summarization of discussion threads
    Verberne, Suzan
    Krahmer, Emiel
    Wubben, Sander
    van den Bosch, Antal
    NATURAL LANGUAGE ENGINEERING, 2020, 26 (01) : 3 - 29
  • [37] Regularizing query-based retrieval scores
    Fernando Diaz
    Information Retrieval, 2007, 10 : 531 - 562
  • [38] Mining Heterogeneous Information Networks: Principles and Methodologies
    Sun, Yizhou
    Han, Jiawei
    Synthesis Lectures on Data Mining and Knowledge Discovery, 2012, 3 (02) : 1 - 161
  • [39] Query-based Client-Indexing in Client-Server-Information Systems
    Hoepfner, Hagen
    COMPUTER SCIENCE-RESEARCH AND DEVELOPMENT, 2006, 20 (04) : 209 - 221
  • [40] Mining outliers in spatial networks
    Jin, Wen
    Jiang, Yuelong
    Qian, Weining
    Tung, Anthony K. H.
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2006, 3882 : 156 - 170