Summarizing Probabilistic Frequent Patterns: A Fast Approach

Cited by: 0
Authors
Liu, Chunyang [1]
Chen, Ling [1]
Zhang, Chengqi [1]
Affiliations
[1] Univ Technol Sydney, QCIS, Sydney, NSW, Australia
Funding
Australian Research Council
Keywords
Pattern Summarization; Uncertain Data;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Mining probabilistic frequent patterns from uncertain data has received a great deal of attention in recent years owing to its wide range of applications. However, probabilistic frequent pattern mining generates an exponential number of result patterns, which seriously hinders further evaluation and analysis. In this paper, we focus on the problem of mining probabilistic representative frequent patterns (P-RFP), the minimal set of patterns that represents all frequent patterns with adequately high probability. The bottleneck lies in checking whether one pattern can probabilistically represent another, which requires computing a joint probability over the supports of the two patterns. We introduce a novel approximation of this joint probability, supported by both theoretical and empirical proofs. Based on the approximation, we propose an Approximate P-RFP Mining (APM) algorithm, which effectively and efficiently compresses the set of probabilistic frequent patterns. To our knowledge, this is the first attempt to analyze the relationship between two probabilistic frequent patterns through an approximate approach. Our experiments on both synthetic and real-world datasets demonstrate that the APM algorithm accelerates P-RFP mining dramatically, running orders of magnitude faster than an exact solution. Moreover, the error rate of APM is guaranteed to be very small when the database contains hundreds of transactions, which further affirms that APM is a practical solution for summarizing probabilistic frequent patterns.
Pages: 527-535
Page count: 9