Mining Top-k Pairs of Correlated Subgraphs in a Large Network

Cited by: 10
Authors
Prateek, Arneish [1]
Khan, Arijit [2]
Goyal, Akshit [1]
Ranu, Sayan [1]
Affiliations
[1] Indian Inst Technol, Delhi, India
[2] Nanyang Technol Univ, Singapore, Singapore
Source
Proceedings of the VLDB Endowment, 2020, Vol. 13, No. 9
Keywords
TOOL; ALIGNMENT;
DOI
10.14778/3397230.3397245
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We investigate the problem of correlated subgraphs mining (CSM), where the goal is to identify pairs of subgraph patterns that frequently co-occur in proximity within a single graph. Correlated subgraph patterns differ from frequent subgraphs because the connections between constituent subgraph instances are flexible; thus, existing frequent subgraph mining algorithms cannot be directly applied to CSM. Moreover, computing the degree of correlation between two patterns requires enumerating every pair of subgraph instances of both patterns and finding the distance between them, a task that is both memory-intensive and computationally demanding. To this end, we propose two holistic best-first exploration algorithms: CSM-E (an exact method) and CSM-A (a more efficient approximate method with near-optimal quality). To further improve efficiency, we propose a top-k pruning strategy; to reduce the memory footprint, we develop a compressed data structure called Replica, which stores all instances of a subgraph pattern on demand. Our empirical results demonstrate that the proposed algorithms not only mine interesting correlations but also scale well to large networks.
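To make the cost described in the abstract concrete, the following is a minimal brute-force sketch of the underlying task: count, for each pair of patterns, how many instance pairs lie within h hops of each other, then keep the k highest-scoring pattern pairs. It is not CSM-E or CSM-A and does not use Replica. The correlation score here (a raw count of close instance pairs), the function names, the threshold h, and the toy graph are all assumptions made for illustration, and pattern instances are assumed to be given rather than mined.

```python
from collections import deque
from heapq import nlargest
from itertools import combinations


def hops_from(graph, sources, limit):
    """Multi-source BFS: hop distance from the node set `sources`, up to `limit`."""
    dist = {v: 0 for v in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        if dist[u] == limit:
            continue  # do not expand beyond the hop limit
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def close_pairs(graph, instances_a, instances_b, h):
    """Count instance pairs (one from each pattern) within h hops of each other."""
    count = 0
    for a in instances_a:
        reach = hops_from(graph, a, h)
        count += sum(1 for b in instances_b if any(v in reach for v in b))
    return count


def top_k_pairs(graph, instances, h, k):
    """Score every pattern pair exhaustively and return the k best (score, p, q)."""
    scored = [
        (close_pairs(graph, instances[p], instances[q], h), p, q)
        for p, q in combinations(sorted(instances), 2)
    ]
    return nlargest(k, scored)


# Hypothetical toy input: a path graph 0-1-2-3-4-5 and hand-picked instances.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
instances = {
    "A": [frozenset({0}), frozenset({3})],
    "B": [frozenset({1}), frozenset({4})],
    "C": [frozenset({5})],
}

print(top_k_pairs(graph, instances, h=1, k=2))
```

The nested loop over all instance pairs of all pattern pairs is exactly the exhaustive enumeration that the abstract calls memory-intensive and computationally demanding, which is the motivation it gives for best-first exploration and top-k pruning.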
Pages: 1511-1524
Page count: 14