Summarizing Probabilistic Frequent Patterns: A Fast Approach

Cited by: 0
Authors
Liu, Chunyang [1]
Chen, Ling [1]
Zhang, Chengqi [1]
Affiliations
[1] Univ Technol Sydney, QCIS, Sydney, NSW, Australia
Funding
Australian Research Council;
Keywords
Pattern Summarization; Uncertain Data;
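DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract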
Mining probabilistic frequent patterns from uncertain data has received a great deal of attention in recent years because of its wide range of applications. However, probabilistic frequent pattern mining generates an exponential number of result patterns, which seriously hinders further evaluation and analysis. In this paper, we focus on the problem of mining probabilistic representative frequent patterns (P-RFP), the minimal set of patterns that represents all frequent patterns with adequately high probability. Observing that the bottleneck lies in checking whether one pattern can probabilistically represent another, which requires computing a joint probability of the supports of the two patterns, we introduce a novel approximation of this joint probability, supported by both theoretical analysis and empirical evidence. Based on the approximation, we propose an Approximate P-RFP Mining (APM) algorithm, which effectively and efficiently compresses the set of probabilistic frequent patterns. To our knowledge, this is the first attempt to analyze the relationship between two probabilistic frequent patterns through an approximate approach. Our experiments on both synthetic and real-world datasets demonstrate that the APM algorithm accelerates P-RFP mining dramatically, running orders of magnitude faster than an exact solution. Moreover, the error rate of APM is guaranteed to be very small when the database contains hundreds of transactions, which further confirms that APM is a practical solution for summarizing probabilistic frequent patterns.
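For readers unfamiliar with the setting, the following minimal Python sketch illustrates what "probabilistic frequent" commonly means for a single pattern. It is not the APM algorithm and does not compute the joint probability of two supports discussed in the abstract. It assumes a model that this record does not specify: each transaction contains the pattern independently with a known probability, and a pattern counts as probabilistically frequent when the probability that its support reaches `min_sup` is at least `min_prob`. The function names and both thresholds are hypothetical.

```python
from typing import List


def support_pmf(probs: List[float]) -> List[float]:
    """Return the distribution of the support count of one pattern.

    Assumption: transaction i contains the pattern independently with
    probability probs[i], so the support follows a Poisson binomial
    distribution. pmf[k] is the probability that the support equals k.
    """
    pmf = [1.0]  # with no transactions, the support is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # transaction does not contain the pattern
            nxt[k + 1] += mass * p      # transaction contains the pattern
        pmf = nxt
    return pmf


def frequentness(probs: List[float], min_sup: int) -> float:
    """Probability that the support is at least min_sup."""
    return sum(support_pmf(probs)[min_sup:])


def is_probabilistic_frequent(probs: List[float], min_sup: int,
                              min_prob: float) -> bool:
    """Hypothetical check: frequentness must reach the min_prob threshold."""
    return frequentness(probs, min_sup) >= min_prob


if __name__ == "__main__":
    # Three uncertain transactions containing the pattern with these probabilities.
    example = [0.9, 0.8, 0.5]
    print(frequentness(example, 2))
    print(is_probabilistic_frequent(example, 2, 0.8))
```

The dynamic program takes quadratic time in the number of transactions for one pattern. The abstract identifies the pairwise check on the joint distribution of two supports as the bottleneck, which is what the paper's approximation targets.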
Pages: 527-535
Number of pages: 9