Counting frequent patterns in large labeled graphs: a hypergraph-based approach

Cited by: 1
Authors
Meng, Jinghan [1]
Pitaksirianan, Napath [1]
Tu, Yi-Cheng [1]
Affiliations
[1] Univ S Florida, 4202 E Fowler Ave, Tampa, FL 33620 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
Data mining; Graph mining; Support measure; Hypergraph; EFFICIENT ALGORITHM; SUBGRAPH;
DOI
10.1007/s10618-020-00686-9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graph databases have grown rapidly in popularity in recent years. This paper focuses on the single graph as an effective model for representing information, and on the graph mining techniques associated with it. Frequent pattern mining in a single-graph setting poses two main problems: the support measure and the search scheme. We propose a novel framework for designing support measures that brings together existing minimum-image-based and overlap-graph-based support measures. The framework is built on the concept of occurrence/instance hypergraphs. On this basis, we design new support measures, the minimum instance (MI) measure and the minimum vertex cover (MVC) measure, that combine the advantages of existing measures. More importantly, we show that the existing minimum-image-based support measure is an upper bound of the MI measure, which is also linear-time computable and yields counts close to the number of instances of a pattern. We show that most major existing support measures, as well as the new measures proposed in this paper, can be mapped into the new framework, and that they occupy different locations on the frequency spectrum. Taking advantage of the new framework, we find that MVC can be approximated to a constant factor (in terms of the number of pattern nodes) in polynomial time. Contrary to common belief, we demonstrate that the state-of-the-art overlap-graph-based maximum independent set (MIS) measure also admits constant-factor approximation algorithms. We further show that, using standard linear programming and semidefinite programming techniques, polynomial-time relaxations of both the MVC and MIS measures can be developed, and that their counts lie between those of MIS and MVC. In addition, we point out that MVC, MIS, and their relaxations are bounded within a constant factor. In summary, all major support measures are unified in the new hypergraph-based framework, which helps reveal their bounding relations and hardness properties.
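The bounding relations mentioned in the abstract can be illustrated on a toy input. The following is a minimal sketch, not the authors' implementation: it assumes occurrences are given as tuples of data-graph vertices indexed by pattern node, treats each distinct vertex set as one hyperedge (an instance), and uses the textbook maximal-matching greedy as a stand-in for a k-approximation of the vertex cover. All function names are hypothetical, and the MI measure is omitted because the abstract does not define it.

```python
from itertools import combinations

def mni_support(occurrences):
    """Minimum-image-based support: the smallest number of distinct
    data vertices that any single pattern node is mapped to."""
    if not occurrences:
        return 0
    k = len(occurrences[0])
    return min(len({occ[i] for occ in occurrences}) for i in range(k))

def instance_hyperedges(occurrences):
    """Assumed hypergraph view: one hyperedge per distinct vertex set."""
    return sorted({frozenset(occ) for occ in occurrences}, key=sorted)

def greedy_matching_cover(hyperedges):
    """Textbook greedy: pick pairwise-disjoint hyperedges (a maximal
    matching) and take all their vertices as a vertex cover.
    With k vertices per hyperedge the cover is at most k times optimal."""
    matching, cover = [], set()
    for edge in hyperedges:
        if cover.isdisjoint(edge):
            matching.append(edge)
            cover |= edge
    return matching, cover

def exact_mvc(hyperedges):
    """Brute-force minimum vertex cover size; only for tiny examples."""
    vertices = sorted(set().union(*hyperedges)) if hyperedges else []
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            chosen = set(cand)
            if all(chosen & edge for edge in hyperedges):
                return size
    return len(vertices)

# Toy input: a single-edge pattern with identical labels matched on a
# triangle with vertices 1, 2, 3 (both orientations of every edge).
occurrences = [(1, 2), (2, 1), (2, 3), (3, 2), (3, 1), (1, 3)]
edges = instance_hyperedges(occurrences)
matching, cover = greedy_matching_cover(edges)

print("disjoint instances found (lower bound on MIS):", len(matching))
print("exact MVC:", exact_mvc(edges))
print("greedy cover size:", len(cover))
print("MNI:", mni_support(occurrences))
```

On this input the greedy pass finds a single disjoint instance, the exact cover has size 2, and the minimum-image count is 3, so the three values land at different points of the frequency spectrum. The brute-force `exact_mvc` is exponential and is included only to check the greedy result on small cases.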
Pages: 980-1021
Page count: 42
Related papers
50 in total
  • [21] Energy Efficiency Optimization in U2U Multicast Networks: A Hypergraph-Based Matching Approach
    Jia, Qian
    Zhang, Yuli
    Chen, Runfeng
    Liu, Dianxiong
    Wang, Haichao
    Jiao, Yutao
    Li, Guoxin
    IEEE SYSTEMS JOURNAL, 2023, 17 (02): : 2189 - 2200
  • [22] A Bounded and Adaptive Memory-Based Approach to Mine Frequent Patterns From Very Large Databases
    Adnan, Muhaimenul
    Alhajj, Reda
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2011, 41 (01): : 154 - 172
  • [23] Predicting individual socioeconomic status from mobile phone data: a semi-supervised hypergraph-based factor graph approach
    Zhao, Tao
    Huang, Hong
    Yao, Xiaoming
    Luo, Jar-der
    Fu, Xiaoming
    INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2020, 9 (03) : 361 - 372
  • [24] Predicting individual socioeconomic status from mobile phone data: a semi-supervised hypergraph-based factor graph approach
    Tao Zhao
    Hong Huang
    Xiaoming Yao
    Jar-der Luo
    Xiaoming Fu
    International Journal of Data Science and Analytics, 2020, 9 : 361 - 372
  • [25] Mining maximum frequent access patterns in web logs based on unique labeled tree
    Zhang, Ling
    Yin, Ran-ping
    Zhan, Yu-bin
    WEB INFORMATION SYSTEMS - WISE 2006 WORKSHOPS, PROCEEDINGS, 2006, 4256 : 73 - 82
  • [26] A Global Online Handwriting Recognition Approach Based on Frequent Patterns
    Gmati, Chekib
    Amiri, Hamid
    ENGINEERING TECHNOLOGY & APPLIED SCIENCE RESEARCH, 2018, 8 (03) : 2887 - 2891
  • [27] Mining Top-k Frequent Patterns in Large Geosocial Networks: A Mnie-Based Extension Approach
    Zhou, Changben
    Xu, Jian
    Jiang, Ming
    Tang, Donghang
    Wang, Sheng
    IEEE ACCESS, 2023, 11 : 27662 - 27675
  • [28] An Efficient Count Based Transaction Reduction Approach For Mining Frequent Patterns
    Vijayalakshmi, V.
    Pethalakshmi, A.
    GRAPH ALGORITHMS, HIGH PERFORMANCE IMPLEMENTATIONS AND ITS APPLICATIONS (ICGHIA 2014), 2015, 47 : 52 - 61
  • [29] Novel graph classification approach based on frequent closed emerging patterns
    School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
    Jisuanji Yanjiu yu Fazhan, 2007, 7 (1169-1176):
  • [30] Efficient Triangle Counting in Large Graphs via Degree-Based Vertex Partitioning
    Kolountzakis, Mihail N.
    Miller, Gary L.
    Peng, Richard
    Tsourakakis, Charalampos E.
    ALGORITHMS AND MODELS FOR THE WEB GRAPH, 2010, 6516 : 15 - +