Approximate mining of maximal frequent itemsets in data streams with different window models

Cited by: 10
Authors
Li, Hua-Fu [1]
Lee, Suh-Yin [2]
Affiliations
[1] Kainan Univ, Dept Comp Sci, Tao Yuan 338, Taiwan
[2] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 300, Taiwan
Keywords
data mining; data streams; maximal frequent itemsets; one-pass mining; approximate mining;
DOI
10.1016/j.eswa.2007.07.046
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A data stream is a massive, open-ended sequence of data elements continuously generated at a rapid rate. Mining data streams is more difficult than mining static databases because of the huge, high-speed and continuous characteristics of streaming data. In this paper, we propose a new one-pass algorithm called DSM-MFI (Data Stream Mining for Maximal Frequent Itemsets), which mines the set of all maximal frequent itemsets in landmark windows over data streams. A new summary data structure called the summary frequent itemset forest (abbreviated as SFI-forest) is developed for incrementally maintaining the essential information about maximal frequent itemsets embedded in the stream so far. Theoretical analysis and experimental studies show that the proposed algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of the data streams. (c) 2007 Elsevier Ltd. All rights reserved.
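To make the abstract's terms concrete, here is a minimal sketch of what "maximal frequent itemsets over a landmark window" means. It is not DSM-MFI and does not use an SFI-forest: it is an exact, brute-force baseline whose cost grows exponentially with transaction length. The function name `maximal_frequent_itemsets`, the absolute threshold `min_count`, and the toy stream are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_count):
    """Brute-force illustration (hypothetical helper, not the paper's method).

    Treats every transaction since the start of the stream as one landmark
    window, counts every non-empty subset of each transaction in a single
    pass, and then keeps only the frequent itemsets that have no frequent
    proper superset.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Enumerate all non-empty subsets; only feasible for short transactions.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1

    # An itemset is frequent if it occurs in at least min_count transactions.
    frequent = {itemset for itemset, c in counts.items() if c >= min_count}

    # An itemset is maximal if no frequent itemset strictly contains it.
    maximal = [s for s in frequent if not any(s < other for other in frequent)]
    return sorted(maximal, key=lambda s: (len(s), sorted(s)))


# Toy stream (made-up data): five transactions over items a, b, c.
stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
result = maximal_frequent_itemsets(stream, min_count=3)
print([sorted(s) for s in result])
```

In this toy stream each pair occurs three times and the triple only twice, so the pairs are the maximal frequent itemsets. A streaming algorithm of the kind the abstract describes would avoid enumerating every subset by maintaining a compact summary instead.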
Pages: 781-789
Number of pages: 9