Efficient weighted sequential pattern mining

Cited by: 1
Authors
Chen, Shaotao [1]
Chen, Jiahui [1]
Wan, Shicheng [2]
Affiliations
[1] Guangdong Univ Technol, Dept Comp Sci, Guangzhou 510006, Peoples R China
[2] South China Univ Technol, Sch Business Adm, Guangzhou 510641, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Data mining; Sequential pattern mining; Remaining item; Weighted sequence; Tighter upper bound;
DOI
10.1016/j.eswa.2023.122703
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In real-life applications, data mining tasks involve extracting valuable but hidden information from massive data, and how to discover interesting patterns in large databases effectively remains an active topic. Sequential pattern mining is the most popular approach in the data mining domain. Traditional sequential pattern mining research generally focuses on discovering frequent sequential patterns. However, the number of times a pattern occurs does not adequately indicate its importance. For instance, some frequent patterns (e.g., pencil and eraser) are not profitable, whereas some infrequent patterns (e.g., extreme weather) are high-risk. To extract more useful information, researchers study the weighted sequential pattern mining task. In this paper, an efficient algorithm for weighted sequential pattern mining, called EWSPM, is proposed. Two new, tighter upper bounds, namely MWEbound and MSRIWbound, are designed based on the concepts of maximum weight estimation (abbreviated as MWE) and maximum summation of remaining item weights (abbreviated as MSRIW), respectively. These upper bounds achieve better pruning and reduce the size of the search space during the mining process, which significantly shortens execution time. In addition, a database-projection method is employed to optimize memory usage, which addresses potential memory explosion issues to a certain degree. Finally, extensive experiments were conducted on nine datasets (both real and synthetic). The experimental results demonstrate that the EWSPM algorithm is capable of mining all interesting patterns efficiently, with the smallest search space. Additionally, the new algorithm exhibits superior performance in terms of execution time and memory consumption.
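The abstract does not give the definitions of MWEbound or MSRIWbound, so the following is only a minimal Python sketch of the general idea of pruning with an upper bound on weighted support. It is not the EWSPM algorithm. The sketch assumes single-item events, defines the weighted support of a pattern as the average weight of its items multiplied by its support count (a common convention in weighted sequential pattern mining), and compares a loose bound that uses the heaviest item in the whole database with a tighter one that looks only at the items remaining after the pattern's earliest match. All function names, the toy database, and the weights are hypothetical.

```python
from typing import Dict, List, Tuple

Sequence = List[str]
Pattern = Tuple[str, ...]


def first_match_end(seq: Sequence, pattern: Pattern) -> int:
    """Index just past the earliest subsequence match of pattern in seq, or -1."""
    pos = 0
    for item in pattern:
        try:
            pos = seq.index(item, pos) + 1
        except ValueError:
            return -1
    return pos


def weighted_support(db: List[Sequence], weights: Dict[str, float],
                     pattern: Pattern) -> float:
    """Assumed definition: average item weight times support count."""
    support = sum(1 for s in db if first_match_end(s, pattern) >= 0)
    avg_weight = sum(weights[i] for i in pattern) / len(pattern)
    return avg_weight * support


def global_bound(db: List[Sequence], weights: Dict[str, float],
                 pattern: Pattern) -> float:
    """Loose bound: heaviest item in the whole database times support."""
    support = sum(1 for s in db if first_match_end(s, pattern) >= 0)
    return max(weights.values()) * support


def remaining_bound(db: List[Sequence], weights: Dict[str, float],
                    pattern: Pattern) -> float:
    """Tighter bound (illustrative): only the pattern's own items and the items
    remaining after its earliest match can appear in a suffix extension."""
    support = 0
    candidates = set(pattern)
    for s in db:
        end = first_match_end(s, pattern)
        if end >= 0:
            support += 1
            candidates.update(s[end:])
    return max(weights[i] for i in candidates) * support


# Hypothetical toy data.
db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
weights = {"a": 9, "b": 4, "c": 2}

if __name__ == "__main__":
    p = ("b",)
    print(weighted_support(db, weights, p))
    print(global_bound(db, weights, p))
    print(remaining_bound(db, weights, p))
```

On this toy data, with a minimum weighted support of 15, the loose bound for the pattern `("b",)` is 27, so it cannot prune the branch, while the remaining-item bound is 12 and prunes it. That is the kind of search-space reduction the abstract attributes to tighter bounds.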
Pages: 12