CloFAST: closed sequential pattern mining using sparse and vertical id-lists

Cited by: 0
Authors
Fabio Fumarola
Pasqua Fabiana Lanotte
Michelangelo Ceci
Donato Malerba
Affiliations
[1] University of Bari "A. Moro", Department of Computer Science
Source
Knowledge and Information Systems | 2016 / Vol. 48
Keywords
Sequential pattern mining; Closed sequences; Data mining; Itemset
DOI
Not available
Abstract
Sequential pattern mining is a computationally challenging task, since algorithms have to generate and/or test a combinatorially explosive number of intermediate subsequences. To reduce this complexity, some researchers focus on mining closed sequential patterns. This not only increases efficiency but also yields a more compact result set that preserves the expressive power of the patterns extracted by traditional (non-closed) sequential pattern mining algorithms. In this paper, we present CloFAST, a novel algorithm for mining closed frequent sequences of itemsets. It combines a new representation of the dataset, based on sparse id-lists and vertical id-lists, whose theoretical properties are studied in order to count the support of sequential patterns quickly, with a novel one-step technique that both checks sequence closure and prunes the search space. Unlike almost all existing algorithms, which iteratively alternate itemset extension and sequence extension, CloFAST proceeds in two steps. First, all closed frequent itemsets are mined to obtain an initial set of sequences of size 1. Then, new sequences are generated by working directly on the sequences, without mining additional frequent itemsets. A thorough performance study on both real-world and artificially generated datasets shows empirically that CloFAST outperforms state-of-the-art algorithms in both time and memory consumption, especially when mining long closed sequences.
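As a rough illustration of the id-list idea mentioned in the abstract, the following Python sketch counts support from per-sequence position lists. It is a simplified reading under stated assumptions, not the paper's data structures or algorithm: the function names (sparse_id_list, vertical_id_list, sil_support, vil_support) and the toy dataset are hypothetical, and closure checking and search-space pruning are not shown.

```python
from typing import FrozenSet, List, Optional

Itemset = FrozenSet[str]
SequenceDB = List[List[Itemset]]  # each input sequence is an ordered list of itemsets


def sparse_id_list(db: SequenceDB, itemset: Itemset) -> List[List[int]]:
    """For each input sequence, list the positions whose itemset contains `itemset`.

    Hypothetical simplification: an empty list means the itemset never occurs there.
    """
    return [[pos for pos, tx in enumerate(seq) if itemset <= tx] for seq in db]


def sil_support(sil: List[List[int]]) -> int:
    """Support of an itemset = number of input sequences with a non-empty entry."""
    return sum(1 for entry in sil if entry)


def vertical_id_list(sils: List[List[List[int]]]) -> List[Optional[int]]:
    """For a pattern given as the id-lists of its itemsets (in order), return, per
    input sequence, the position where the earliest match of the pattern ends,
    or None when the pattern does not occur in that sequence."""
    vil: List[Optional[int]] = []
    for j in range(len(sils[0])):
        pos: Optional[int] = -1
        for sil in sils:
            # Greedy step: earliest occurrence strictly after the previous match.
            pos = next((p for p in sil[j] if p > pos), None)
            if pos is None:
                break
        vil.append(pos)
    return vil


def vil_support(vil: List[Optional[int]]) -> int:
    """Support of a sequential pattern = number of non-None entries."""
    return sum(1 for p in vil if p is not None)


# Toy dataset (an assumption made for illustration only).
a, b, c = "a", "b", "c"
db: SequenceDB = [
    [frozenset({a, b}), frozenset({c}), frozenset({a})],
    [frozenset({a}), frozenset({b, c})],
    [frozenset({b}), frozenset({a, b}), frozenset({c})],
    [frozenset({c}), frozenset({a})],
]

sil_a = sparse_id_list(db, frozenset({a}))
sil_c = sparse_id_list(db, frozenset({c}))
vil_ac = vertical_id_list([sil_a, sil_c])  # pattern <{a}, {c}>

print(sil_support(sil_a), vil_ac, vil_support(vil_ac))
```

In this sketch the itemset id-lists are built once from the dataset, and longer patterns are then evaluated from those lists alone, which loosely mirrors the two-step organisation the abstract describes.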
Pages: 429-463
Page count: 34