Mining Top-k Pairs of Correlated Subgraphs in a Large Network

Cited by: 9
Authors
Prateek, Arneish [1]
Khan, Arijit [2]
Goyal, Akshit [1]
Ranu, Sayan [1]
Affiliations
[1] Indian Inst Technol, Delhi, India
[2] Nanyang Technol Univ, Singapore, Singapore
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2020 / Vol. 13 / Issue 09
Keywords
TOOL; ALIGNMENT;
DOI
10.14778/3397230.3397245
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We investigate the problem of correlated subgraphs mining (CSM), where the goal is to identify pairs of subgraph patterns that frequently co-occur in proximity within a single graph. Correlated subgraph patterns differ from frequent subgraphs because the connections between constituent subgraph instances are flexible; thus, existing frequent subgraph mining algorithms cannot be directly applied to CSM. Moreover, computing the degree of correlation between two patterns requires enumerating every pair of subgraph instances of both patterns and finding the distance between them, a task that is both memory-intensive and computationally demanding. To this end, we propose two holistic best-first exploration algorithms: CSM-E (an exact method) and CSM-A (a more efficient approximate method with near-optimal quality). To further improve efficiency, we propose a top-k pruning strategy, and to reduce the memory footprint, we develop a compressed data structure called Replica, which stores all instances of a subgraph pattern on demand. Our empirical results demonstrate that the proposed algorithms not only mine interesting correlations, but also achieve good scalability over large networks.
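The abstract does not give the formal correlation measure, so the following is only a minimal brute-force sketch of the cost it describes, not CSM-E, CSM-A, or the Replica structure. It assumes (hypothetically) that the correlation of two patterns is the number of instance pairs lying within `h` hops of each other, and that the instances of each pattern have already been enumerated as node sets. The names `within_hops`, `co_occurrence_count`, and `top_k_pairs` are illustrative and do not come from the paper.

```python
from collections import deque
from itertools import combinations
import heapq


def within_hops(adj, sources, h):
    """Multi-source BFS: nodes within h hops of any node in `sources`."""
    seen = set(sources)
    frontier = deque((s, 0) for s in sources)
    while frontier:
        node, dist = frontier.popleft()
        if dist == h:
            continue  # do not expand beyond the hop limit
        for nb in adj.get(node, ()):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, dist + 1))
    return seen


def co_occurrence_count(adj, instances_a, instances_b, h):
    """Assumed correlation measure: number of (instance of A, instance of B)
    pairs whose closest nodes are at most h hops apart."""
    count = 0
    for inst_a in instances_a:
        reach = within_hops(adj, inst_a, h)
        count += sum(1 for inst_b in instances_b if reach & set(inst_b))
    return count


def top_k_pairs(adj, pattern_instances, h, k):
    """Score every pattern pair and keep the k best in a size-k min-heap."""
    heap = []  # entries are (score, pattern_a, pattern_b)
    for a, b in combinations(sorted(pattern_instances), 2):
        score = co_occurrence_count(
            adj, pattern_instances[a], pattern_instances[b], h
        )
        if len(heap) < k:
            heapq.heappush(heap, (score, a, b))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, a, b))
    return sorted(heap, reverse=True)
```

Every pattern pair here triggers one BFS per instance of the first pattern and a scan over all instances of the second, which illustrates why the exhaustive approach is expensive in both time and memory and why the abstract motivates pruning and a compressed instance store.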
Pages: 1511 - 1524
Page count: 14
Related papers
50 in total
  • [1] TOP-COP: Mining TOP-K strongly correlated pairs in large databases
    Xiong, Hui
    Brodie, Mark
    Ma, Sheng
    [J]. ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 1162 - 1166
  • [2] Mining Top-K Frequent Correlated Subgraph Pairs in Graph Databases
    Shang, Li
    Jian, Yujiao
    [J]. INTELLIGENT INFORMATICS, 2013, 182 : 1 - 8
  • [3] TKG: Efficient Mining of Top-K Frequent Subgraphs
    Fournier-Viger, Philippe
    Cheng, Chao
    Lin, Jerry Chun-Wei
    Yun, Unil
    Kiran, R. Uday
    [J]. BIG DATA ANALYTICS (BDA 2019), 2019, 11932 : 209 - 226
  • [4] Mining Top-K Large Structural Patterns in a Massive Network
    Zhu, Feida
    Qu, Qiang
    Lo, David
    Yan, Xifeng
    Han, Jiawei
    Yu, Philip S.
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2011, 4 (11) : 807 - 818
  • [5] Mining top-k strongly correlated item pairs without minimum correlation threshold
    He, Zengyou
    Xu, Xiaofei
    Deng, Shengchun
    [J]. INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2006, 10 (02) : 105 - 112
  • [6] Efficient Mining of Top-K Strongly Correlated Item Pairs using One Pass Technique
    Roy, S.
    Bhattacharyya, D. K.
    [J]. ADCOM: 2008 16TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS, 2008, : 416 - +
  • [7] Top-k overlapping densest subgraphs
    Galbrun, Esther
    Gionis, Aristides
    Tatti, Nikolaj
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2016, 30 (05) : 1134 - 1165
  • [8] Top-k overlapping densest subgraphs
    Esther Galbrun
    Aristides Gionis
    Nikolaj Tatti
    [J]. Data Mining and Knowledge Discovery, 2016, 30 : 1134 - 1165
  • [9] Fully Dynamic Algorithm for Top-k Densest Subgraphs
    Nasir, Muhammad Anis Uddin
    Gionis, Aristides
    Morales, Gianmarco De Francisci
    Girdzijauskas, Sarunas
    [J]. CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 1817 - 1826
  • [10] Distributed Top-k Pattern Mining
    Wang, Xin
    Xiang, Mingyue
    Zhan, Huayi
    Lan, Zhuo
    He, Yuang
    He, Yanxiao
    Sha, Yuji
    [J]. WEB AND BIG DATA, APWEB-WAIM 2021, PT II, 2021, 12859 : 203 - 220