Scalable maximal subgraph mining with backbone-preserving graph convolutions

Cited by: 1
Authors
Nguyen, Thanh Toan [1]
Huynh, Thanh Trung [2]
Weidlich, Matthias [3]
Tho, Quan Thanh [4, 5]
Yin, Hongzhi [6]
Aberer, Karl [2]
Nguyen, Quoc Viet Hung [7]
Affiliations
[1] HUTECH Univ, Fac Informat Technol, Ho Chi Minh City, Vietnam
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Humboldt Univ, Berlin, Germany
[4] Ho Chi Minh City Univ Technol HCMUT, Fac Comp Sci & Engn, 268 Ly Thuong Kiet St, Dist 10, Ho Chi Minh City, Vietnam
[5] Vietnam Natl Univ Ho Chi Minh City, Linh Trung Ward, Ho Chi Minh City, Vietnam
[6] Univ Queensland, Brisbane, Australia
[7] Griffith Univ, Nathan, Australia
Funding
Swiss National Science Foundation
Keywords
Maximal subgraph mining; Graph embedding; Graph convolutional networks; Scalable approximation; Rumor detection; Network; Efficient
DOI
10.1016/j.ins.2023.119287
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Maximal subgraph mining is increasingly important in various domains, including bioinformatics, genomics, and chemistry, as it helps identify common characteristics among a set of graphs and enables their classification into different categories. Existing approaches for identifying maximal subgraphs typically rely on traversing a graph lattice. However, in practice, these approaches are limited to relatively small subgraphs due to the exponential growth of the search space and the NP-completeness of the underlying subgraph isomorphism test. In this work, we propose SCAMA, an approach that addresses these limitations by adopting a divide-and-conquer strategy for efficient mining of maximal subgraphs. Our approach involves initially partitioning a graph database into equivalence classes using bootstrapped backbones, which are tree-shaped frequent subgraphs. We then introduce a learning process based on a novel graph convolutional network (GCN) to extract maximal backbones for each equivalence class. A critical insight of our approach is that by estimating each maximal backbone directly in the embedding space, we can avoid the exponential traversal of the graph lattice. From the extracted maximal backbones, we construct the maximal frequent subgraphs. Furthermore, we outline how SCAMA can be extended to perform top-k largest frequent subgraph mining and how the discovered patterns facilitate graph classification. Our experimental results demonstrate the effectiveness of SCAMA in identifying almost perfectly maximal frequent subgraphs, while exhibiting approximately 10 times faster performance compared to the best baseline technique.
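To make the notion of a maximal frequent subgraph concrete, the following is a minimal sketch of the brute-force lattice traversal that the abstract describes as the costly baseline. It is not SCAMA and not the authors' code. It assumes, purely for illustration, that every node label is unique, so a graph can be represented as a set of labeled edges and subgraph containment reduces to set inclusion (no isomorphism test is needed), and it ignores the usual requirement that patterns be connected. The names `frequent_edge_sets`, `maximal_only`, and `min_support` are hypothetical.

```python
from itertools import combinations


def frequent_edge_sets(database, min_support):
    """Enumerate every edge subset contained in at least min_support graphs.

    Assumption: node labels are unique, so "pattern is a subgraph of g"
    is simply "pattern is a subset of g". The loop visits the whole
    subset lattice, which is exponential in the number of distinct edges.
    """
    all_edges = sorted(set().union(*database))
    frequent = []
    for size in range(1, len(all_edges) + 1):
        for candidate in combinations(all_edges, size):
            pattern = frozenset(candidate)
            support = sum(1 for graph in database if pattern <= graph)
            if support >= min_support:
                frequent.append(pattern)
    return frequent


def maximal_only(patterns):
    """Keep only patterns that have no frequent proper superset."""
    return [p for p in patterns if not any(p < q for q in patterns)]


# Toy database of three graphs, each given as a set of labeled edges.
database = [
    frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
    frozenset({("A", "B"), ("B", "C"), ("C", "E")}),
    frozenset({("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")}),
]

maximal = maximal_only(frequent_edge_sets(database, min_support=2))
print(maximal)
```

With a support threshold of 2, seven edge subsets are frequent, but only the three-edge pattern A-B, B-C, C-D is maximal. Even in this toy setting the enumeration touches every subset of the five distinct edges, which illustrates why lattice-based methods are limited to small patterns.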
Pages: 22