Summarizing Probabilistic Frequent Patterns: A Fast Approach

Authors
Liu, Chunyang [1]
Chen, Ling [1]
Zhang, Chengqi [1]
Affiliations
[1] Univ Technol Sydney, QCIS, Sydney, NSW, Australia
Funding
Australian Research Council;
Keywords
Pattern Summarization; Uncertain Data;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mining probabilistic frequent patterns from uncertain data has received a great deal of attention in recent years owing to its wide range of applications. However, probabilistic frequent pattern mining generates an exponential number of result patterns, which seriously hinders further evaluation and analysis. In this paper, we focus on the problem of mining probabilistic representative frequent patterns (P-RFP), that is, the minimal set of patterns that represents all frequent patterns with adequately high probability. The bottleneck lies in checking whether one pattern can probabilistically represent another, which requires computing the joint probability of the supports of the two patterns. We introduce a novel approximation of this joint probability, supported by both theoretical analysis and empirical evidence. Based on the approximation, we propose an Approximate P-RFP Mining (APM) algorithm, which effectively and efficiently compresses the set of probabilistic frequent patterns. To our knowledge, this is the first attempt to analyze the relationship between two probabilistic frequent patterns through an approximate approach. Our experiments on both synthetic and real-world datasets demonstrate that the APM algorithm accelerates P-RFP mining dramatically, running orders of magnitude faster than an exact solution. Moreover, the error rate of APM is guaranteed to be very small when the database contains hundreds of transactions, which further confirms that APM is a practical solution for summarizing probabilistic frequent patterns.
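For readers new to the setting, the notion of a probabilistic frequent pattern can be illustrated with a minimal sketch. It assumes the common tuple-uncertainty model, in which each transaction contains the pattern independently with a known probability; the abstract itself does not fix a model, and the function names, `min_sup`, and `min_prob` below are illustrative rather than taken from the paper. The sketch computes the support distribution of a single pattern by dynamic programming. It does not implement APM or the joint-probability approximation.

```python
from typing import List, Sequence


def support_distribution(probs: Sequence[float]) -> List[float]:
    """Return [P(support = 0), ..., P(support = n)] for one pattern.

    Assumption: probs[i] is the probability that transaction i contains
    the pattern, and transactions are independent of each other.
    """
    dist = [1.0]  # with zero transactions, the support is 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # transaction does not contain the pattern
            nxt[k + 1] += mass * p       # transaction contains the pattern
        dist = nxt
    return dist


def frequentness_probability(probs: Sequence[float], min_sup: int) -> float:
    """Return P(support >= min_sup) under the independence assumption."""
    return sum(support_distribution(probs)[min_sup:])


def is_probabilistic_frequent(
    probs: Sequence[float], min_sup: int, min_prob: float
) -> bool:
    """A pattern counts as probabilistic frequent here if its support
    reaches min_sup with probability at least min_prob (hypothetical
    threshold names)."""
    return frequentness_probability(probs, min_sup) >= min_prob


if __name__ == "__main__":
    # Three uncertain transactions; the pattern occurs in each with these odds.
    occurrence = [0.9, 0.8, 0.5]
    print(support_distribution(occurrence))
    print(frequentness_probability(occurrence, min_sup=2))
    print(is_probabilistic_frequent(occurrence, min_sup=2, min_prob=0.7))
```

The dynamic program runs in time quadratic in the number of transactions for a single pattern. Checking whether one pattern represents another needs the joint distribution of two supports, which is the more expensive computation the abstract identifies as the bottleneck.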
Pages: 527 - 535
Number of pages: 9