gApprox: Mining frequent approximate patterns from a massive network

Cited by: 46
Authors
Chen, Chen [1 ]
Yan, Xifeng [2 ]
Zhu, Feida [1 ]
Han, Jiawei [1 ]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] IBM T J Watson Res Ctr, Yorktown Hts, NY USA
DOI
10.1109/ICDM.2007.36
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Graphs with massive sizes and complex structures have recently emerged in many new applications, such as biological networks, social networks, and the Web, and they demand powerful data mining methods. Because of inherent noise or data diversity, it is crucial to address approximation if one wants to mine patterns that are potentially interesting while tolerating variations. In this paper, we investigate the problem of mining frequent approximate patterns from a massive network and propose a method called gApprox. gApprox not only finds approximate network patterns, which is key to many knowledge discovery applications on structural data, but also enriches the library of graph mining methodologies with several novel techniques, such as: (1) a complete and redundancy-free strategy for exploring the new pattern space faced by gApprox; and (2) a transformation of "frequent in an approximate sense" into an anti-monotonic constraint, so that it can be pushed deep into the mining process. Systematic empirical studies on both real and synthetic data sets show that the frequent approximate patterns mined from the worm protein-protein interaction network are biologically interesting and that gApprox is both effective and efficient.
Pages: 445 / +
Number of pages: 2