A Proposition for Sequence Mining Using Pattern Structures

Cited by: 6
Authors
Codocedo, Victor [1, 3]
Bosc, Guillaume [2]
Kaytoue, Mehdi [2]
Boulicaut, Jean-Francois [2]
Napoli, Amedeo [3]
Affiliations
[1] Inria Chile, Las Condes, Chile
[2] Univ Lyon, CNRS, INSA Lyon, LIRIS, Lyon, France
[3] Univ Lorraine, INRIA Nancy Grand Est, CNRS, LORIA, Nancy, France
DOI
10.1007/978-3-319-59271-8_7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article we present a novel approach to rare sequence mining using pattern structures. In particular, we are interested in mining closed sequences, a type of maximal sub-element that provides a succinct description of the patterns in a sequence database. We present and describe a sequence pattern structure model in which rare closed subsequences can be easily encoded. We also discuss and characterize the search space of closed sequences and, through the notion of sequence alignments, provide an intuitive implementation of a similarity operator for the sequence pattern structure based on directed acyclic graphs. Finally, we provide an experimental evaluation of our approach against state-of-the-art closed sequence mining algorithms, showing that it can largely outperform them when dealing with large regions of the search space.
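As a rough illustration of what "closed" means here, the sketch below enumerates closed sequences by brute force on a toy database. It assumes the usual definition (a sequence is closed when no proper supersequence has the same support) and simplifies sequences to ordered lists of single items. The function names and the toy data are hypothetical, and this is not the pattern-structure method the abstract describes.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same order."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def all_subsequences(database):
    """Every non-empty subsequence of at least one database sequence."""
    found = set()
    for seq in database:
        for k in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), k):
                found.add(tuple(seq[i] for i in idx))
    return found


def closed_sequences(database):
    """Brute-force search: keep patterns with no longer supersequence of equal support."""
    candidates = all_subsequences(database)
    supports = {p: support(p, database) for p in candidates}
    closed = {}
    for p, s in supports.items():
        absorbed = any(
            len(q) > len(p) and supports[q] == s and is_subsequence(p, q)
            for q in candidates
        )
        if not absorbed:
            closed[p] = s
    return closed


# Hypothetical toy database of three item sequences.
toy_db = [("a", "b", "c"), ("a", "b", "d"), ("a", "c")]
result = closed_sequences(toy_db)
```

On this toy input, `("b",)` is not closed because `("a", "b")` occurs in exactly the same two sequences, while `("a", "b")` itself is closed because each of its extensions has lower support. The enumeration is exponential in sequence length, so it is only meant to make the definition concrete.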
Pages: 106-121
Number of pages: 16
Related papers
50 in total
  • [21] NOSEP: Nonoverlapping Sequence Pattern Mining With Gap Constraints
    Wu, Youxi
    Tong, Yao
    Zhu, Xingquan
    Wu, Xindong
    IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (10) : 2809 - 2822
  • [22] Spatial-Temporal Moving Sequence Pattern Mining
    Han, Seon-Young
    Yong, Hwan-Seung
    KOREAN JOURNAL OF APPLIED STATISTICS, 2006, 19 (03) : 599 - 617
  • [23] An Estimation of Distribution Algorithms Applied to Sequence Pattern Mining
    Godinho, Paulo Igor A.
    Goncalves Meiguins, Aruanda S.
    Limao de Oliveira, Roberto C.
    Meiguins, Bianchi S.
    INNOVATIONS IN COMPUTING SCIENCES AND SOFTWARE ENGINEERING, 2010, : 589 - 593
  • [24] Sequential Pattern Mining with Inaccurate Event in Temporal Sequence
    Ren, Jiadong
    Tian, Haiyan
    NCM 2008: 4TH INTERNATIONAL CONFERENCE ON NETWORKED COMPUTING AND ADVANCED INFORMATION MANAGEMENT, VOL 2, PROCEEDINGS, 2008, : 659 - 664
  • [25] The Apriori Property of Sequence Pattern Mining with Wildcard Gaps
    Min, Fan
    Wu, Youxi
    Wu, Xindong
    2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS (BIBMW), 2010, : 138 - 143
  • [26] An Alert Correlation Algorithm Based on the Sequence Pattern Mining
    Lv, Yanli
    Li, Yuanlong
    Xiang, Shuang
    Xia, Chunhe
    Geng, Jingxin
    2015 IEEE ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2015, : 1146 - 1151
  • [27] High utility pattern mining using the maximal itemset property and lexicographic tree structures
    Lin, Ming-Yen
    Tu, Tzer-Fu
    Hsueh, Sue-Chen
    INFORMATION SCIENCES, 2012, 215 : 1 - 14
  • [28] Efficient STMPM(Spatio-Temporal Moving Pattern Mining) Using Moving Sequence Tree
    Lee, YonSik
    Ko, Hyun
    NCM 2008: 4TH INTERNATIONAL CONFERENCE ON NETWORKED COMPUTING AND ADVANCED INFORMATION MANAGEMENT, VOL 2, PROCEEDINGS, 2008, : 432 - 437
  • [29] Using API Calls for Sequence-Pattern Feature Mining-Based Malware Detection
    Balan, Gheorghe
    Gavrilut, Dragos Teodor
    Luchian, Henri
    INFORMATION SECURITY PRACTICE AND EXPERIENCE, ISPEC 2022, 2022, 13620 : 233 - 251
  • [30] Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags
    Xu, Y
    Mural, RJ
    Uberbacher, EC
    ISMB-97 - FIFTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY, PROCEEDINGS, 1997, : 344 - 353