CloFAST: closed sequential pattern mining using sparse and vertical id-lists

Authors
Fabio Fumarola
Pasqua Fabiana Lanotte
Michelangelo Ceci
Donato Malerba
Affiliation
[1] University of Bari “A. Moro”, Department of Computer Science
Source
Knowledge and Information Systems, 2016, Vol. 48
Keywords
Sequential pattern mining; Closed sequences; Data mining; Itemset
DOI
Not available
Abstract
Sequential pattern mining is a computationally challenging task, since algorithms have to generate and/or test a combinatorially explosive number of intermediate subsequences. To reduce complexity, some researchers focus on mining closed sequential patterns. This not only increases efficiency but also yields a more compact result set, while preserving the expressive power of the patterns extracted by traditional (non-closed) sequential pattern mining algorithms. In this paper, we present CloFAST, a novel algorithm for mining closed frequent sequences of itemsets. It combines a new representation of the dataset, based on sparse id-lists and vertical id-lists, whose theoretical properties are studied in order to count the support of sequential patterns quickly, with a novel one-step technique that both checks sequence closure and prunes the search space. Unlike almost all existing algorithms, which iteratively alternate itemset extension and sequence extension, CloFAST proceeds in two steps. First, all closed frequent itemsets are mined to obtain an initial set of sequences of size 1. Then, new sequences are generated by working directly on the sequences, without mining additional frequent itemsets. A thorough performance study on both real-world and artificially generated datasets shows empirically that CloFAST outperforms state-of-the-art algorithms in both time and memory consumption, especially when mining long closed sequences.
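The id-list idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is a simplified reading, not the paper's data structures or algorithm: the function names (`sparse_id_list`, `initial_vil`, `extend_vil`, `pattern_support`) and the toy database are assumptions made for illustration. For each input sequence, the sketch stores the positions where an itemset occurs, then tracks the earliest position at which a growing pattern can end, so that support is just the number of sequences with a non-empty entry.

```python
from typing import FrozenSet, List, Optional

Itemset = FrozenSet[str]
SequenceDB = List[List[Itemset]]


def sparse_id_list(db: SequenceDB, itemset: Itemset) -> List[List[int]]:
    """Per sequence, the sorted positions whose transaction contains `itemset`."""
    return [[pos for pos, t in enumerate(seq) if itemset <= t] for seq in db]


def initial_vil(sil: List[List[int]]) -> List[Optional[int]]:
    """Per sequence, the first position of the itemset, or None if absent."""
    return [positions[0] if positions else None for positions in sil]


def extend_vil(vil: List[Optional[int]], sil: List[List[int]]) -> List[Optional[int]]:
    """Sequence extension: earliest position of the new itemset strictly
    after the position where the current pattern ends."""
    out: List[Optional[int]] = []
    for end, positions in zip(vil, sil):
        nxt = None
        if end is not None:
            nxt = next((p for p in positions if p > end), None)
        out.append(nxt)
    return out


def support(vil: List[Optional[int]]) -> int:
    """Number of sequences that contain the pattern."""
    return sum(p is not None for p in vil)


def pattern_support(db: SequenceDB, pattern: List[Itemset]) -> int:
    """Support of a sequence of itemsets, computed only from id-lists."""
    vil = initial_vil(sparse_id_list(db, pattern[0]))
    for itemset in pattern[1:]:
        vil = extend_vil(vil, sparse_id_list(db, itemset))
    return support(vil)


# Hypothetical toy database: four sequences of itemsets.
db: SequenceDB = [
    [frozenset("a"), frozenset("b"), frozenset("c")],
    [frozenset("ab"), frozenset("c")],
    [frozenset("a"), frozenset("c")],
    [frozenset("b"), frozenset("a")],
]

sup_c = pattern_support(db, [frozenset("c")])                   # 3
sup_ac = pattern_support(db, [frozenset("a"), frozenset("c")])  # 3
sup_a = pattern_support(db, [frozenset("a")])                   # 4
```

In this toy data the pattern <{c}> has the same support as its supersequence <{a},{c}>, so <{c}> is not closed and a closed-pattern miner would report only the longer one. The pattern <{a}> has strictly higher support than <{a},{c}>, so that supersequence does not rule it out.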
Pages: 429-463
Number of pages: 34