CMRules: Mining sequential rules common to several sequences

Cited by: 82
Authors
Fournier-Viger, Philippe [1]
Faghihi, Usef [2]
Nkambou, Roger [3]
Nguifo, Engelbert Mephu [4, 5]
Affiliations
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 70101, Taiwan
[2] Univ Memphis, Dept Comp Sci, Memphis, TN 38152 USA
[3] Univ Quebec, Dept Comp Sci, Montreal, PQ H3C 3P8, Canada
[4] Univ Blaise Pascal, Clermont Univ, LIMOS, F-63000 Clermont Ferrand, France
[5] LIMOS, CNRS, UMR 6158, F-63173 Aubiere, France
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Sequential rule mining; Sequential pattern mining; Association rule mining; Sequence database; Educational data mining; PATTERNS
DOI
10.1016/j.knosys.2011.07.005
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Sequential rule mining is an important data mining task used in a wide range of applications. However, current algorithms for discovering sequential rules common to several sequences use very restrictive definitions of sequential rules, which make them unable to recognize that similar rules can describe the same phenomenon. This can have many undesirable effects, such as (1) similar rules that are rated differently, (2) rules that are not found because they are considered uninteresting when taken individually, and (3) rules that are too specific, which makes them less likely to be used for making predictions. In this paper, we address these problems by proposing a more general form of sequential rules such that items in the antecedent and in the consequent of each rule are unordered. We propose an algorithm named CMRules for mining this form of rules. The algorithm proceeds by first finding association rules to prune the search space for items that occur jointly in many sequences. Then it eliminates association rules that do not meet the minimum confidence and support thresholds according to the sequential ordering. We evaluate the performance of CMRules in three different ways. First, we provide an analysis of its time complexity. Second, we compare its performance (in terms of execution time, memory usage and scalability) with an adaptation of an algorithm from the literature that we name CMDeo. For this comparison, we use three real-life public datasets, which have different characteristics and represent three kinds of data. In many cases, results show that CMRules is faster and scales better for low support thresholds than CMDeo. Lastly, we report a successful application of the algorithm in a tutoring agent. (C) 2011 Elsevier B.V. All rights reserved.
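The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a sequence is a list of itemsets, that a rule X => Y holds in a sequence when every item of X appears before every item of Y, that sequential support is the fraction of sequences in which the rule holds, and that sequential confidence is that count divided by the number of sequences containing X. The names `occurs_in_order`, `mine_rules`, `max_size` and `example_db` are hypothetical, and the brute-force candidate enumeration stands in for a real association rule miner.

```python
from itertools import combinations


def occurs_in_order(sequence, antecedent, consequent):
    """True if all antecedent items appear in itemsets 1..k and all
    consequent items appear in itemsets k+1..n, for some split point k."""
    seen = set()
    for k in range(len(sequence) - 1):
        seen |= sequence[k]
        if antecedent <= seen:
            # The earliest valid split leaves the largest remainder,
            # so checking it alone is sufficient.
            rest = set().union(*sequence[k + 1:])
            return consequent <= rest
    return False


def mine_rules(database, min_sup, min_conf, max_size=3):
    """Brute-force sketch of the two phases (assumed definitions, see lead-in)."""
    n = len(database)
    # Phase 1 view: ignore ordering, treat each sequence as one transaction.
    transactions = [set().union(*seq) for seq in database]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, max_size + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Prune itemsets that do not co-occur in enough sequences.
            if sum(itemset <= t for t in transactions) / n < min_sup:
                continue
            for r in range(1, size):
                for left in combinations(sorted(itemset), r):
                    x = frozenset(left)
                    y = itemset - x
                    count_x = sum(x <= t for t in transactions)
                    if count_x == 0:
                        continue
                    # Phase 2: re-check the rule against the sequential ordering.
                    count_xy = sum(occurs_in_order(s, x, y) for s in database)
                    seq_sup = count_xy / n
                    seq_conf = count_xy / count_x
                    if seq_sup >= min_sup and seq_conf >= min_conf:
                        rules.append((x, y, seq_sup, seq_conf))
    return rules


# Hypothetical toy database: four sequences of itemsets.
example_db = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"c"}, {"b"}],
    [{"b"}, {"a"}, {"c"}],
    [{"a", "b"}, {"c"}],
]
```

Because the antecedent is unordered, the sequences a, b, c and b, a, c both count toward the single rule {a, b} => {c} in this sketch, instead of supporting two separate, more specific rules.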
Pages: 63-76
Number of pages: 14