Counting frequent patterns in large labeled graphs: a hypergraph-based approach

Cited by: 1
Authors
Meng, Jinghan [1]
Pitaksirianan, Napath [1]
Tu, Yi-Cheng [1]
Affiliations
[1] Univ S Florida, 4202 E Fowler Ave, Tampa, FL 33620 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Data mining; Graph mining; Support measure; Hypergraph; EFFICIENT ALGORITHM; SUBGRAPH;
DOI
10.1007/s10618-020-00686-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the popularity of graph databases has grown rapidly. This paper focuses on the single graph as an effective model for representing information, and on the graph mining techniques related to it. Frequent pattern mining in a single-graph setting poses two main problems: the support measure and the search scheme. We propose a novel framework for designing support measures that brings together the existing minimum-image-based and overlap-graph-based support measures. The framework is built on the concept of occurrence/instance hypergraphs. On this basis, we design a series of new support measures, the minimum instance (MI) measure and the minimum vertex cover (MVC) measure, that combine the advantages of existing measures. More importantly, we show that the existing minimum-image-based support measure is an upper bound of the MI measure, which is also linear-time computable and yields counts close to the number of instances of a pattern. We show that most major existing support measures, as well as the new measures proposed in this paper, can not only be mapped into the new framework but also occupy different locations on the frequency spectrum. By taking advantage of the new framework, we find that MVC can be approximated within a constant factor (in terms of the number of pattern nodes) in polynomial time. Contrary to common belief, we demonstrate that the state-of-the-art overlap-graph-based maximum independent set (MIS) measure also admits constant-factor approximation algorithms. We further show that, using standard linear programming and semidefinite programming techniques, polynomial-time relaxations of both the MVC and MIS measures can be developed, and that their counts lie between those of MVC and MIS. In addition, we point out that MVC, MIS, and their relaxations are bounded within a constant factor. In summary, all major support measures are unified in the new hypergraph-based framework, which helps reveal their bounding relations and hardness properties.
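As a rough illustration of two ideas named in the abstract, the Python sketch below computes a minimum-image-based support from a list of occurrences and a matching-based approximate vertex cover of an instance hypergraph. The function names, the dictionary representation of occurrences, and the choice to identify an instance by its node set are all assumptions made for this example. The greedy routine is the textbook k-approximation for vertex cover in hypergraphs whose hyperedges have at most k nodes; it is not necessarily the algorithm used in the paper.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Set, Tuple

# Assumed representation: an occurrence maps each pattern node to a data-graph node.
Occurrence = Dict[Hashable, Hashable]


def minimum_image_support(occurrences: List[Occurrence]) -> int:
    """Minimum-image-based support: the smallest number of distinct
    data-graph nodes that any single pattern node is mapped to."""
    if not occurrences:
        return 0
    pattern_nodes = list(occurrences[0].keys())
    return min(len({occ[v] for occ in occurrences}) for v in pattern_nodes)


def instance_hyperedges(occurrences: Iterable[Occurrence]) -> Set[FrozenSet]:
    """Simplification (assumption): one hyperedge per distinct set of
    data-graph nodes touched by an occurrence."""
    return {frozenset(occ.values()) for occ in occurrences}


def greedy_vertex_cover(
    hyperedges: Iterable[FrozenSet],
) -> Tuple[Set, List[FrozenSet]]:
    """Matching-based approximation of a minimum vertex cover.

    Repeatedly pick a hyperedge that is not yet covered and add all of
    its nodes to the cover. The picked hyperedges are pairwise disjoint,
    so len(picked) <= optimum <= len(cover) <= k * len(picked), where k
    is the largest hyperedge size (at most the number of pattern nodes).
    """
    cover: Set = set()
    picked: List[FrozenSet] = []
    # Sort for a deterministic result; assumes node ids are mutually comparable.
    for edge in sorted(hyperedges, key=lambda e: sorted(e)):
        if cover.isdisjoint(edge):
            cover |= edge
            picked.append(edge)
    return cover, picked


if __name__ == "__main__":
    # Toy data: a two-node pattern (u, v) matched in a star whose centre is node 1.
    occs = [{"u": 1, "v": 2}, {"u": 1, "v": 3}, {"u": 1, "v": 4}]
    cover, picked = greedy_vertex_cover(instance_hyperedges(occs))
    print(minimum_image_support(occs), len(picked), len(cover))
```

In the toy star example, all three occurrences share the centre node, so the minimum-image support is 1 and the true minimum vertex cover also has size 1. The greedy routine returns a cover of size 2, which stays within the factor k = 2 of the optimum.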
Pages: 980-1021
Page count: 42