Counting frequent patterns in large labeled graphs: a hypergraph-based approach

Cited by: 1
Authors
Meng, Jinghan [1]
Pitaksirianan, Napath [1]
Tu, Yi-Cheng [1]
Affiliations
[1] University of South Florida, 4202 E Fowler Ave, Tampa, FL 33620, USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Data mining; Graph mining; Support measure; Hypergraph; Efficient algorithm; Subgraph
DOI
10.1007/s10618-020-00686-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the popularity of graph databases has grown rapidly. This paper focuses on the single graph as an effective model for representing information, and on the graph mining techniques associated with it. Frequent pattern mining in a single-graph setting poses two main problems: the support measure and the search scheme. We propose a novel framework for designing support measures that brings together the existing minimum-image-based and overlap-graph-based support measures. The framework is built on the concept of occurrence/instance hypergraphs. Based on it, we design a series of new support measures, namely the minimum instance (MI) measure and the minimum vertex cover (MVC) measure, that combine the advantages of existing measures. More importantly, we show that the existing minimum-image-based support measure is an upper bound of the MI measure, which is also linear-time computable and yields counts that are close to the number of instances of a pattern. We show not only that most major existing support measures and the new measures proposed in this paper can be mapped into the new framework, but also that they occupy different locations on the frequency spectrum. By taking advantage of the new framework, we discover that MVC can be approximated to within a constant factor (in terms of the number of pattern nodes) in polynomial time. Contrary to common belief, we demonstrate that the state-of-the-art overlap-graph-based maximum independent set (MIS) measure also admits constant-factor approximation algorithms. We further show that, using standard linear programming and semidefinite programming techniques, polynomial-time relaxations of both the MVC and MIS measures can be developed, and that their counts lie between those of MIS and MVC. In addition, we point out that MVC, MIS, and their relaxations are bounded within a constant factor. In summary, all major support measures are unified in the new hypergraph-based framework, which helps reveal their bounding relations and hardness properties.
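The relations between the measures named in the abstract can be illustrated on a toy example. The sketch below is not taken from the paper: it assumes the commonly used definitions (minimum-image-based support as the smallest number of distinct images of any pattern node; MVC as a minimum set of data vertices hitting every instance; MIS as a maximum set of pairwise vertex-disjoint instances), uses brute force that is only suitable for tiny inputs, and all function names and the toy data are hypothetical.

```python
from itertools import combinations

def mni_support(occurrences):
    """Minimum-image-based support: min over pattern nodes of distinct images."""
    k = len(occurrences[0])
    return min(len({occ[i] for occ in occurrences}) for i in range(k))

def instance_hyperedges(occurrences):
    """One hyperedge (set of data vertices) per distinct instance."""
    return {frozenset(occ) for occ in occurrences}

def mvc_support(hyperedges):
    """Brute-force minimum vertex cover of the hypergraph (exponential time)."""
    vertices = sorted(set().union(*hyperedges))
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            chosen = set(cand)
            if all(chosen & e for e in hyperedges):
                return size

def mis_support(hyperedges):
    """Brute-force maximum set of pairwise disjoint hyperedges (exponential time)."""
    edges = list(hyperedges)
    for size in range(len(edges), 0, -1):
        for cand in combinations(edges, size):
            if all(a.isdisjoint(b) for a, b in combinations(cand, 2)):
                return size
    return 0

def greedy_cover(hyperedges):
    """Take all vertices of a maximal set of disjoint hyperedges.

    With hyperedges of size at most k, the result has at most k * MVC vertices.
    """
    cover = set()
    for e in sorted(hyperedges, key=sorted):
        if not (cover & e):
            cover |= e
    return cover

# Assumed toy data: the pattern is a single edge whose two nodes share a label,
# and the data graph is a triangle on vertices 1, 2, 3 with that same label.
# Each tuple maps pattern nodes (p0, p1) to data vertices.
occurrences = [(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)]
edges = instance_hyperedges(occurrences)

mni = mni_support(occurrences)
mvc = mvc_support(edges)
mis = mis_support(edges)
cover = greedy_cover(edges)
print(mis, mvc, mni, len(cover))  # prints: 1 2 3 2
```

On this example the counts satisfy MIS <= MVC <= minimum-image support, and the greedy cover stays within a factor of the pattern size (here 2) of the optimal cover, which is the kind of constant-factor bound the abstract refers to.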
Pages: 980-1021
Number of pages: 42
Related papers
50 in total
  • [1] Counting frequent patterns in large labeled graphs: a hypergraph-based approach
    Jinghan Meng
    Napath Pitaksirianan
    Yi-Cheng Tu
    Data Mining and Knowledge Discovery, 2020, 34 : 980 - 1021
  • [2] A Hypergraph-Based Approach to Feature Selection
    Zhang, Zhihong
    Hancock, Edwin R.
    COMPUTER ANALYSIS OF IMAGES AND PATTERNS: 14TH INTERNATIONAL CONFERENCE, CAIP 2011, PT I, 2011, 6854 : 228 - 235
  • [3] A HYPERGRAPH-BASED INTERCONNECTION NETWORK FOR LARGE MULTICOMPUTERS
    MACKENZIE, LM
    OULDKHAOUA, M
    SUTHERLAND, RJ
    KELLY, T
    LECTURE NOTES IN COMPUTER SCIENCE, 1992, 634 : 837 - 838
  • [4] A Hypergraph-Based Modeling Approach for Service Systems
    Li, Mahei Manhai
    Peters, Christoph
    Leimeister, Jan Marco
    ADVANCES IN SERVICE SCIENCE, 2019, : 61 - 72
  • [5] A Hypergraph-based Approach to Affine Parameters Estimation
    Bulo, S. Rota
    Albarelli, A.
    Torsello, A.
    Pelillo, M.
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 3482 - 3485
  • [6] Flexible and Feasible Support Measures for Mining Frequent Patterns in Large Labeled Graphs
    Meng, Jinghan
    Tu, Yi-Cheng
    SIGMOD'17: PROCEEDINGS OF THE 2017 ACM INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2017, : 391 - 402
  • [7] A General Computational Approach for Counting Labeled Graphs
    Goyal, Ravi
    De Gruttola, Victor
    ALGORITHMS, 2023, 16 (01)
  • [8] Generalizing Design of Support Measures for Counting Frequent Patterns in Graphs
    Meng, Jinghan
    Pitaksirianan, Napath
    Tu, Yicheng
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 533 - 542
  • [9] Mining Frequent Neighborhood Patterns in a Large Labeled Graph
    Han, Jialong
    Wen, Ji-Rong
    PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 259 - 268
  • [10] Knowledge hypergraph-based approach for data integration and querying: Application to Earth Observation
    Masmoudi, Maroua
    Lamine, Sana Ben Abdallah Ben
    Zghal, Hajer Baazaoui
    Archimede, Bernard
    Karray, Mohamed Hedi
    Future Generation Computer Systems, 2021, 115 : 720 - 740