Efficient weighted sequential pattern mining

Cited by: 1
Authors
Chen, Shaotao [1]
Chen, Jiahui [1]
Wan, Shicheng [2]
Affiliations
[1] Guangdong Univ Technol, Dept Comp Sci, Guangzhou 510006, Peoples R China
[2] South China Univ Technol, Sch Business Adm, Guangzhou 510641, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data mining; Sequential pattern mining; Remaining item; Weighted sequence; Tighter upper bound
DOI
10.1016/j.eswa.2023.122703
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In real-life applications, data mining involves extracting valuable but hidden information from massive data, and how to effectively discover interesting patterns in large databases remains an active topic. Sequential pattern mining is one of the most popular approaches in the data mining domain. Traditional sequential pattern mining research generally focuses on discovering frequent sequential patterns. However, the number of times a pattern occurs does not adequately indicate its importance. For instance, some frequent patterns (e.g., pencil and eraser) are not profitable, whereas some infrequent patterns (e.g., extreme weather) are high-risk. To extract more useful information, researchers study the weighted sequential pattern mining task. This paper proposes an efficient algorithm for this task, called EWSPM. Two new strict upper bounds, MWEbound and MSRIWbound, are designed based on the concepts of maximum weight estimation (MWE) and maximum summation of remaining item weights (MSRIW), respectively. These upper bounds prune more effectively and reduce the size of the search space during mining, which significantly shortens execution time. In addition, a database-projection method is employed to optimize memory usage, which mitigates potential memory explosion to a certain degree. Finally, extensive experiments were conducted on nine datasets, both real and synthetic. The experimental results demonstrate that EWSPM mines all interesting patterns efficiently with the smallest search space, and that it also exhibits superior performance in terms of execution time and memory consumption.
Pages: 12
Related papers
50 in total
  • [21] WSpCPs: Weighted Sequential Pattern Mining based on Cluster-Pruning Mechanism
    Fu, Yu
    Yu, Yanhua
    Song, Meina
    [J]. 2013 INTERNATIONAL CONFERENCE ON COMPUTATIONAL PROBLEM-SOLVING (ICCP), 2013, : 291 - 294
  • [22] WIS: Weighted interesting sequential pattern mining with a similar level of support and/or weight
    Yun, Unil
    [J]. ETRI JOURNAL, 2007, 29 (03) : 336 - 352
  • [23] CCSMP: an efficient closed contiguous sequential pattern mining algorithm with a pattern relation graph
    Hu, Haichuan
    Zhang, Jingwei
    Xia, Ruiqing
    Liu, Shichao
    [J]. APPLIED INTELLIGENCE, 2023, 53 (24) : 29723 - 29740
  • [24] CCSMP: an efficient closed contiguous sequential pattern mining algorithm with a pattern relation graph
    Haichuan Hu
    Jingwei Zhang
    Ruiqing Xia
    Shichao Liu
    [J]. Applied Intelligence, 2023, 53 : 29723 - 29740
  • [25] Efficient approach for incremental weighted erasable pattern mining with list structure
    Nam, Hyoju
    Yun, Unil
    Yoon, Eunchul
    Lin, Jerry Chun-Wei
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 143
  • [26] On Selection of Efficient Sequential Pattern Mining Algorithm Based on Characteristics of Data
    Peschel, Jakub
    Batko, Michal
    Zezula, Pavel
    [J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2022, : 202 - 205
  • [27] Combining clustering with moving sequential pattern mining: A novel and efficient technique
    Ma, SA
    Tang, SW
    Yang, DQ
    Wang, TJ
    Han, JQ
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2004, 3056 : 419 - 423
  • [28] Efficient Chain Structure for High-Utility Sequential Pattern Mining
    Lin, Jerry Chun-Wei
    Li, Yuanfa
    Fournier-Viger, Philippe
    Djenouri, Youcef
    Zhang, Ji
    [J]. IEEE ACCESS, 2020, 8 : 40714 - 40722
  • [29] Efficient mining of sequential patterns with time constraints by delimited pattern growth
    Ming-Yen Lin
    Suh-Yin Lee
    [J]. Knowledge and Information Systems, 2005, 7 : 499 - 514
  • [30] Efficient mining of sequential patterns with time constraints by delimited pattern growth
    Lin, MY
    Lee, SY
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2005, 7 (04) : 499 - 514