MANIACS: Approximate Mining of Frequent Subgraph Patterns through Sampling

Cited by: 5
Authors
Preti, Giulia [1]
Morales, Gianmarco De Francisci [1]
Riondato, Matteo [2]
Affiliations
[1] CENTAI, Corso Inghilterra 3, I-10138 Turin, Italy
[2] Amherst Coll, Dept Comp Sci, Box 2232, Amherst, MA 01002 USA
Funding
National Science Foundation (USA)
Keywords
Minimum Node Image; pattern mining; VC-dimension; graphlets
DOI
10.1145/3587254
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present MANIACS, a sampling-based randomized algorithm for computing high-quality approximations of the collection of subgraph patterns that are frequent in a single, large, vertex-labeled graph, according to the Minimum Node Image-based (MNI) frequency measure. The output of MANIACS comes with strong probabilistic guarantees, obtained by using the empirical Vapnik-Chervonenkis (VC) dimension, a key concept from statistical learning theory, together with strong probabilistic tail bounds on the difference between the frequency of a pattern in the sample and its exact frequency. MANIACS leverages properties of the MNI-frequency to aggressively prune the pattern search space, and thus to reduce the time spent exploring subspaces that contain no frequent patterns. In turn, this pruning leads to better bounds on the maximum frequency estimation error, which leads to increased pruning, resulting in a beneficial feedback effect. The results of our experimental evaluation of MANIACS on real graphs show that it returns high-quality collections of frequent patterns in large graphs up to two orders of magnitude faster than the exact algorithm.
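For readers unfamiliar with the MNI measure named in the abstract, the following is a minimal brute-force sketch of the exact MNI support of one pattern in a small vertex-labeled graph. It is not the MANIACS algorithm: it does no sampling and no pruning. The function name `mni_support`, the dictionary-based graph encoding, and the choice of non-induced, label-preserving matches are all assumptions made for illustration.

```python
from itertools import permutations

def mni_support(pattern_labels, pattern_edges, graph_labels, graph_edges):
    """Exact MNI support by exhaustive enumeration (illustrative only).

    pattern_labels / graph_labels: dict mapping vertex -> label.
    pattern_edges / graph_edges: iterable of undirected (u, v) pairs.
    Assumption: a match is an injective, label-preserving map that sends
    every pattern edge to a graph edge (non-induced matching).
    """
    graph_edge_set = {frozenset(e) for e in graph_edges}
    p_nodes = list(pattern_labels)
    # For each pattern vertex, the set of graph vertices it is mapped to
    # by at least one match (its "image").
    images = {v: set() for v in p_nodes}
    for cand in permutations(graph_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, cand))
        if any(pattern_labels[v] != graph_labels[mapping[v]] for v in p_nodes):
            continue  # labels do not agree
        if all(frozenset((mapping[a], mapping[b])) in graph_edge_set
               for a, b in pattern_edges):
            for v in p_nodes:
                images[v].add(mapping[v])
    # MNI support is the size of the smallest image.
    return min(len(s) for s in images.values())

# Hypothetical toy data: pattern is a single A-B edge.
pattern_labels = {"x": "A", "y": "B"}
pattern_edges = [("x", "y")]

# Graph 1: two A vertices and two B vertices.
g1_labels = {1: "A", 2: "B", 3: "B", 4: "A"}
g1_edges = [(1, 2), (1, 3), (4, 3)]

# Graph 2: a star with one A center and two B leaves.
g2_labels = {1: "A", 2: "B", 3: "B"}
g2_edges = [(1, 2), (1, 3)]

support_g1 = mni_support(pattern_labels, pattern_edges, g1_labels, g1_edges)
support_g2 = mni_support(pattern_labels, pattern_edges, g2_labels, g2_edges)
```

In the second toy graph the pattern has two matches but only one distinct image for the A vertex, so the support is bounded by that single vertex. Taking the minimum over pattern vertices is what makes the measure usable for pruning a pattern search space.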
Pages: 29