Scalable maximal subgraph mining with backbone-preserving graph convolutions

Cited by: 1
Authors
Nguyen, Thanh Toan [1]
Huynh, Thanh Trung [2]
Weidlich, Matthias [3]
Tho, Quan Thanh [4, 5]
Yin, Hongzhi [6]
Aberer, Karl [2]
Nguyen, Quoc Viet Hung [7]
Affiliations
[1] HUTECH Univ, Fac Informat Technol, Ho Chi Minh City, Vietnam
[2] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[3] Humboldt Univ, Berlin, Germany
[4] Ho Chi Minh City Univ Technol HCMUT, Fac Comp Sci & Engn, 268 Ly Thuong Kiet St, Dist 10, Ho Chi Minh City, Vietnam
[5] Vietnam Natl Univ Ho Chi Minh City, Linh Trung Ward, Ho Chi Minh City, Vietnam
[6] Univ Queensland, Brisbane, Australia
[7] Griffith Univ, Nathan, Australia
Funding
Swiss National Science Foundation;
Keywords
Maximal subgraph mining; Graph embedding; Graph convolutional networks; Scalable approximation; RUMOR DETECTION; NETWORK; EFFICIENT;
DOI
10.1016/j.ins.2023.119287
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Maximal subgraph mining is increasingly important in various domains, including bioinformatics, genomics, and chemistry, as it helps identify common characteristics among a set of graphs and enables their classification into different categories. Existing approaches for identifying maximal subgraphs typically rely on traversing a graph lattice. However, in practice, these approaches are limited to relatively small subgraphs due to the exponential growth of the search space and the NP-completeness of the underlying subgraph isomorphism test. In this work, we propose SCAMA, an approach that addresses these limitations by adopting a divide-and-conquer strategy for efficient mining of maximal subgraphs. Our approach involves initially partitioning a graph database into equivalence classes using bootstrapped backbones, which are tree-shaped frequent subgraphs. We then introduce a learning process based on a novel graph convolutional network (GCN) to extract maximal backbones for each equivalence class. A critical insight of our approach is that by estimating each maximal backbone directly in the embedding space, we can avoid the exponential traversal of the graph lattice. From the extracted maximal backbones, we construct the maximal frequent subgraphs. Furthermore, we outline how SCAMA can be extended to perform top-k largest frequent subgraph mining and how the discovered patterns facilitate graph classification. Our experimental results demonstrate the effectiveness of SCAMA in identifying almost perfectly maximal frequent subgraphs, while exhibiting approximately 10 times faster performance compared to the best baseline technique.
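To make the notion of a maximal frequent subgraph concrete, the sketch below shows a toy version of the level-wise lattice traversal that the abstract attributes to existing approaches. It is not SCAMA and is not taken from the paper. It assumes that node labels are unique within each graph, so that subgraph containment reduces to edge-set inclusion and no isomorphism test is needed, and it ignores the usual requirement that patterns be connected. The function names `support` and `maximal_frequent_patterns` and the example database are hypothetical.

```python
def support(pattern, database):
    """Count the graphs whose edge set contains every edge of `pattern`."""
    return sum(1 for graph in database if pattern <= graph)


def maximal_frequent_patterns(database, min_support):
    """Level-wise lattice traversal over edge sets (toy baseline, not SCAMA).

    Returns the frequent edge sets that have no frequent proper superset.
    The number of candidates can grow exponentially with pattern size.
    """
    edges = sorted(set().union(*database))
    frequent = []
    # Level 1: single edges that meet the support threshold.
    level = [frozenset([e]) for e in edges
             if support(frozenset([e]), database) >= min_support]
    while level:
        frequent.extend(level)
        # Grow every frequent pattern by one edge; the set removes duplicates.
        candidates = {p | {e} for p in level for e in edges if e not in p}
        level = [c for c in candidates if support(c, database) >= min_support]
    # Keep only patterns that are not strictly contained in another frequent one.
    return [p for p in frequent if not any(p < q for q in frequent)]


# Hypothetical database: each graph is a frozenset of labeled edges.
database = [
    frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
    frozenset({("A", "B"), ("B", "C"), ("A", "D")}),
    frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
]

maximal_at_2 = maximal_frequent_patterns(database, min_support=2)
maximal_at_3 = maximal_frequent_patterns(database, min_support=3)
```

With a support threshold of 2 the only maximal pattern in this toy database is the three-edge path A-B-C-D, and raising the threshold to 3 shrinks it to the two-edge path A-B-C. Even here every frequent subset is visited before the maximal one is found, which is the cost the abstract says SCAMA avoids by estimating maximal backbones in the embedding space.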
Pages: 22