Efficient strategies for incremental mining of frequent closed itemsets over data streams

Cited by: 8
Authors
Liu, Junqiang [1, 2]
Ye, Zhousheng [1]
Yang, Xiangcai [1]
Wang, Xueling [1]
Shen, Linjie [2]
Jiang, Xiaoning [1, 2]
Affiliations
[1] Zhejiang Gongshang Univ, Sch Informat & Elect Engn, Hangzhou 310018, Peoples R China
[2] Zhejiang Gongshang Univ, Sussex Artificial Intelligence Inst, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data streams; Closed itemsets; Frequent itemsets; Data mining; Knowledge discovery; HIGH-UTILITY ITEMSETS; ALGORITHM; PATTERNS;
DOI
10.1016/j.eswa.2021.116220
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Mining frequent closed itemsets over data streams is an important data mining problem. It is more challenging than mining static data because data streams arrive at a high rate, in massive volume, and are subject to concept drift. Existing algorithms for mining frequent closed itemsets over data streams suffer from scalability and efficiency bottlenecks. This paper proposes a novel algorithm for mining frequent closed itemsets over data streams under both the sliding window model and the landmark model. An indexed prefix closed itemset tree is proposed to compress all closed itemsets and to support fast lookup of closed itemsets, and novel search strategies are proposed to prune the search space when updating the set of closed itemsets. In efficiency, the proposed algorithm outperforms the state-of-the-art intersection-based algorithms CICLAD, ConPatSet, and CloStream by factors ranging from several times to two orders of magnitude. It also outperforms the state-of-the-art pattern enumeration algorithm, Moment, by up to two orders of magnitude on data streams with large windows and on sparse data streams. The proposed algorithm is also superior in scalability.
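For readers unfamiliar with the problem setting, the following is a minimal Python sketch of a generic intersection-based incremental update under the landmark model (transactions are only ever added). It is not the algorithm proposed in the paper: it uses neither the indexed prefix closed itemset tree nor the pruning strategies, and it is not a faithful reproduction of CICLAD, ConPatSet, or CloStream. The function names `add_transaction` and `frequent_closed` are illustrative assumptions. The sketch relies on the standard fact that, after a transaction t arrives, every new closed itemset is either t itself or the intersection of t with an existing closed itemset.

```python
from typing import Dict, FrozenSet, Iterable

Itemset = FrozenSet[str]


def add_transaction(closed: Dict[Itemset, int], transaction: Iterable[str]) -> None:
    """Update the closed-itemset table in place after one new transaction.

    `closed` maps each closed itemset to its support (landmark model:
    transactions are never removed). This is a naive scan over all closed
    itemsets, with no index and no pruning.
    """
    t = frozenset(transaction)
    if not t:
        return

    # Candidates affected by t: t itself plus every non-empty intersection
    # of t with a closed itemset already in the table.
    candidates = {t}
    for c in closed:
        inter = c & t
        if inter:
            candidates.add(inter)

    # The support of a candidate before t arrived equals the largest support
    # among the old closed itemsets that contain it (0 if there is none).
    updates = {}
    for x in candidates:
        old_support = max((s for c, s in closed.items() if x <= c), default=0)
        updates[x] = old_support + 1  # x is a subset of t, so t adds one

    closed.update(updates)


def frequent_closed(closed: Dict[Itemset, int], min_support: int) -> Dict[Itemset, int]:
    """Return the closed itemsets whose support is at least `min_support`."""
    return {x: s for x, s in closed.items() if s >= min_support}


if __name__ == "__main__":
    table: Dict[Itemset, int] = {}
    for txn in (["a", "b", "c"], ["a", "b"], ["a", "c"]):
        add_transaction(table, txn)
    for itemset, support in sorted(table.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), support)
```

Each arrival here costs time quadratic in the number of stored closed itemsets, which is the kind of cost that an index and search-space pruning, as described in the abstract, aim to reduce.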
Pages: 13