Approximate mining of maximal frequent itemsets in data streams with different window models

Cited by: 10
Authors
Li, Hua-Fu [1]
Lee, Suh-Yin [2]
Affiliations
[1] Kainan Univ, Dept Comp Sci, Tao Yuan 338, Taiwan
[2] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 300, Taiwan
Keywords
data mining; data streams; maximal frequent itemsets; one-pass mining; approximate mining
DOI
10.1016/j.eswa.2007.07.046
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A data stream is a massive, open-ended sequence of data elements continuously generated at a rapid rate. Mining data streams is more difficult than mining static databases because of the huge, high-speed, and continuous nature of streaming data. In this paper, we propose a new one-pass algorithm called DSM-MFI (short for Data Stream Mining for Maximal Frequent Itemsets), which mines the set of all maximal frequent itemsets in landmark windows over data streams. A new summary data structure called the summary frequent itemset forest (abbreviated as SFI-forest) is developed for incrementally maintaining the essential information about maximal frequent itemsets embedded in the stream so far. Theoretical analysis and experimental studies show that the proposed algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of the data streams. (c) 2007 Elsevier Ltd. All rights reserved.
Pages: 781-789
Page count: 9
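
The abstract's central term, maximal frequent itemsets over a landmark window, can be illustrated with a small brute-force sketch. This is not DSM-MFI and does not use the SFI-forest, whose details this record does not give. The function name `maximal_frequent_itemsets`, the `min_count` parameter, and the sample transactions are all assumptions made for illustration. An itemset is treated as frequent when it occurs in at least `min_count` of the transactions seen since the landmark, and as maximal when no frequent proper superset exists.

```python
from collections import Counter
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_count):
    """Brute-force illustration, NOT the DSM-MFI algorithm.

    transactions: iterable of item collections, read once from the
                  landmark (start of the stream) up to now.
    min_count:    hypothetical absolute support threshold.
    """
    counts = Counter()
    for transaction in transactions:  # single pass over the stream
        items = sorted(set(transaction))
        # Count every non-empty subset; exponential in transaction
        # length, so only suitable for tiny examples.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1

    frequent = {s for s, c in counts.items() if c >= min_count}
    # Keep only itemsets with no frequent proper superset.
    return {s for s in frequent if not any(s < other for other in frequent)}


if __name__ == "__main__":
    stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset in sorted(maximal_frequent_itemsets(stream, 3), key=sorted):
        print(sorted(itemset))
```

With a threshold of 3, the pairs {a, b}, {a, c}, and {b, c} are maximal because {a, b, c} occurs only twice. Lowering the threshold to 2 makes {a, b, c} frequent, and it then becomes the only maximal itemset. A summary structure such as the one the abstract describes would presumably avoid storing every subset count, which is what makes this naive version impractical on real streams.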