Mining sequential patterns from probabilistic databases

Cited by: 21
Authors
Muzammal, Muhammad [1]
Raman, Rajeev [2]
Affiliations
[1] Bahria Univ, Dept Comp Sci, Islamabad, Pakistan
[2] Univ Leicester, Dept Comp Sci, Leicester, Leics, England
Keywords
Mining uncertain data; Sequential pattern mining; Probabilistic databases; FREQUENT ITEMSETS; GROWTH;
DOI
10.1007/s10115-014-0766-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper considers the problem of sequential pattern mining (SPM) in probabilistic databases. Specifically, we consider SPM in situations where there is uncertainty in associating an event with a source, model this kind of uncertainty in the probabilistic database framework, and consider the problem of enumerating all sequences whose expected support is sufficiently large. We give an algorithm based on dynamic programming to compute the expected support of a sequential pattern. Next, we propose three algorithms for mining sequential patterns from probabilistic databases. The first two algorithms are based on the candidate generation framework: one uses a breadth-first exploration of the search space (similar to GSP) and the other a depth-first exploration (similar to SPAM). The third is based on the pattern-growth framework (similar to PrefixSpan). We propose optimizations that mitigate the effects of the expensive dynamic programming computation step. We give an empirical evaluation of the probabilistic SPM algorithms and the optimizations, and demonstrate the scalability of the algorithms in terms of CPU time and memory usage. We also demonstrate the effectiveness of the probabilistic SPM framework in extracting meaningful sequences in the presence of noise.
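The abstract does not spell out the dynamic program, so the following is only a minimal sketch of how an expected-support computation of this kind could look. It assumes a simplified uncertainty model that is not stated in this record: each event is an itemset paired with an independent probability of belonging to a given source. Under that assumption, the expected support of a pattern is the sum, over sources, of the probability that the source's sequence contains the pattern. The names `support_probability`, `expected_support`, and the data layout are hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple

# Assumed layout: (itemset, probability that the event belongs to this source).
Event = Tuple[FrozenSet[str], float]


def support_probability(events: Sequence[Event],
                        pattern: Sequence[FrozenSet[str]]) -> float:
    """Probability that one source's uncertain sequence contains `pattern`."""
    q = len(pattern)
    # prob[k] = probability that the events processed so far contain pattern[:k].
    prob = [1.0] + [0.0] * q
    for itemset, c in events:
        # Iterate k downwards so prob[k - 1] still refers to the state
        # before the current event was processed.
        for k in range(q, 0, -1):
            if pattern[k - 1] <= itemset:  # pattern element is a subset of the event
                prob[k] = (1.0 - c) * prob[k] + c * prob[k - 1]
    return prob[q]


def expected_support(database: Sequence[Sequence[Event]],
                     pattern: Sequence[FrozenSet[str]]) -> float:
    """Sum of per-source support probabilities (linearity of expectation)."""
    return sum(support_probability(events, pattern) for events in database)


a, b = frozenset("a"), frozenset("b")
source_1 = [(a, 0.5), (b, 0.5)]
source_2 = [(a, 0.5), (a, 0.5)]
print(support_probability(source_1, [a, b]))      # 0.25
print(expected_support([source_1, source_2], [a]))  # 1.25
```

Each source costs time proportional to the number of events times the pattern length, which is consistent with the abstract's remark that this step is expensive enough to motivate optimizations during mining.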
Pages: 325 - 358
Page count: 34
Related papers
50 items in total
  • [1] Mining Sequential Patterns from Probabilistic Databases
    Muzammal, Muhammad
    Raman, Rajeev
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT II: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011, 2011, 6635 : 210 - 221
  • [2] Mining sequential patterns from probabilistic databases
    Muhammad Muzammal
    Rajeev Raman
    [J]. Knowledge and Information Systems, 2015, 44 : 325 - 358
  • [3] Mining Sequential Patterns from Probabilistic Databases by Pattern-Growth
    Muzammal, Muhammad
    [J]. ADVANCES IN DATABASES, 2011, 7051 : 118 - 127
  • [4] Mining Integrated Sequential Patterns From Multiple Databases
    Ezeife, Christie, I
    Aravindan, Vignesh
    Chaturvedi, Ritu
    [J]. INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2020, 16 (01) : 1 - 21
  • [5] Mining integrated sequential patterns from multiple databases
    Ezeife, Christie I.
    Aravindan, Vignesh
    Chaturvedi, Ritu
    [J]. International Journal of Data Warehousing and Mining, 2020, 16 (01): : 1 - 21
  • [6] Mining dependent patterns in probabilistic databases
    Zhang, SC
    Zhang, CQ
    Yu, JX
    [J]. CYBERNETICS AND SYSTEMS, 2004, 35 (04) : 399 - 424
  • [7] Mining Probabilistic Frequent Spatio-Temporal Sequential Patterns with Gap Constraints from Uncertain Databases
    Li, Yuxuan
    Bailey, James
    Kulik, Lars
    Pei, Jian
    [J]. 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 448 - 457
  • [8] A fast algorithm for mining sequential patterns from large databases
    Chen, N
    Chen, A
    Zhou, LX
    Liu, L
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2001, 16 (04) : 359 - 370
  • [9] A Fast Algorithm for Mining Sequential Patterns from Large Databases
    陈宁
    陈安
    周龙骧
    刘鲁
    [J]. Journal of Computer Science & Technology, 2001, (04) : 359 - 370
  • [10] A fast algorithm for mining sequential patterns from large databases
    Ning Chen
    An Chen
    Longxiang Zhou
    Lu Liu
    [J]. Journal of Computer Science and Technology, 2001, 16 : 359 - 370