Mining Top-k Pairs of Correlated Subgraphs in a Large Network

Cited by: 10
Authors
Prateek, Arneish [1]
Khan, Arijit [2]
Goyal, Akshit [1]
Ranu, Sayan [1]
Affiliations
[1] Indian Inst Technol, Delhi, India
[2] Nanyang Technol Univ, Singapore, Singapore
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2020 / Vol. 13 / Issue 09
Keywords
TOOL; ALIGNMENT;
DOI
10.14778/3397230.3397245
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We investigate the problem of correlated subgraph mining (CSM), where the goal is to identify pairs of subgraph patterns that frequently co-occur in proximity within a single graph. Correlated subgraph patterns differ from frequent subgraphs because of the flexibility in connections between constituent subgraph instances; thus, existing frequent subgraph mining algorithms cannot be directly applied to CSM. Moreover, computing the degree of correlation between two patterns requires enumerating and finding distances between every pair of subgraph instances of both patterns, a task that is both memory-intensive and computationally demanding. To this end, we propose two holistic best-first exploration algorithms: CSM-E (an exact method) and CSM-A (a more efficient approximate method with near-optimal quality). To further improve efficiency, we propose a top-k pruning strategy; to reduce the memory footprint, we develop a compressed data structure called Replica, which stores all instances of a subgraph pattern on demand. Our empirical results demonstrate that the proposed algorithms not only mine interesting correlations, but also achieve good scalability over large networks.
Pages: 1511 - 1524
Page count: 14
Related papers
50 in total
  • [31] Targeted mining of top-k high utility itemsets
    Huang, Shan
    Gan, Wensheng
    Miao, Jinbao
    Han, Xuming
    Fournier-Viger, Philippe
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [32] Supervised Evaluation of Top-k Itemset Mining Algorithms
    Lucchese, Claudio
    Orlando, Salvatore
    Perego, Raffaele
    [J]. BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, 2015, 9263 : 82 - 94
  • [33] TSP: Mining top-k closed sequential patterns
    Petre Tzvetkov
    Xifeng Yan
    Jiawei Han
    [J]. Knowledge and Information Systems, 2005, 7 : 438 - 457
  • [34] An Improved Algorithm for Mining Top-k Association Rules
    Nguyen, Linh T. T.
    Nguyen, Loan T. T.
    Bay Vo
    [J]. ADVANCED COMPUTATIONAL METHODS FOR KNOWLEDGE ENGINEERING, ICCSAMA 2017, 2018, 629 : 117 - 128
  • [35] Mining top-k frequent closed itemsets is not in APX
    Wu, Chienwen
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2006, 3918 : 435 - 439
  • [36] TopUMS: Top-k Utility Mining in Stream Data
    Song, Wei
    Fang, Caiyu
    Gan, Wensheng
    [J]. 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 615 - 622
  • [37] Mining top-k granular association rules for recommendation
    Min, Fan
    Zhu, William
    [J]. PROCEEDINGS OF THE 2013 JOINT IFSA WORLD CONGRESS AND NAFIPS ANNUAL MEETING (IFSA/NAFIPS), 2013, : 1372 - 1376
  • [38] A Unified Approach for Computing Top-k Pairs in Multidimensional Space
    Cheema, Muhammad Aamir
    Lin, Xuemin
    Wang, Haixun
    Wang, Jianmin
    Zhang, Wenjie
    [J]. IEEE 27TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2011), 2011, : 1031 - 1042
  • [39] Efficiently Monitoring Top-k Pairs over Sliding Windows
    Shen, Zhitao
    Cheema, Muhammad Aamir
    Lin, Xuemin
    Zhang, Wenjie
    Wang, Haixun
    [J]. 2012 IEEE 28TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2012, : 798 - 809
  • [40] Distributed top-k aggregation queries at large
    Neumann, Thomas
    Bender, Matthias
    Michel, Sebastian
    Schenkel, Ralf
    Triantafillou, Peter
    Weikum, Gerhard
    [J]. DISTRIBUTED AND PARALLEL DATABASES, 2009, 26 (01) : 3 - 27