An Efficient Sliding Window Based Algorithm for Adaptive Frequent Itemset Mining over Data Streams

Authors
Deypir, Mahmood [1]
Sadreddini, Mohammad Hadi [1]
Taahomi, Mehran [1]
Affiliations
[1] Shiraz Univ, Sch Engn, Dept Comp Sci & Engn, Shiraz, Iran
Keywords
frequent itemset mining; sliding window; data stream; data mining; prefix tree; PATTERNS;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Mining frequent itemsets over high-speed, continuous, and infinite data streams is a challenging problem because of the changing nature of the data and the limited memory and processing capacity of computing systems. The sliding window is an attractive model for this problem, since it does not need the entire history of received transactions and can handle concept change by considering only a limited range of recent transactions. However, previous sliding window algorithms require a large amount of memory and processing time. This paper introduces a new algorithm based on a prefix tree data structure to find and update the frequent itemsets of the window. To enhance performance, a batch of transactions, rather than a single transaction, is used as the unit of insertion and deletion within the window. Moreover, by using an effective traversal strategy for the prefix tree and a suitable representation for each batch of transactions, the updating of current itemsets and the insertion of newly emerged itemsets are performed together, improving performance even further. Additionally, by storing the required information in each node of the prefix tree, the proposed algorithm efficiently deletes old batches of transactions from the window and prunes infrequent itemsets. Although our algorithm maintains more information in the prefix tree than previous algorithms, it does not need to store the transactions of the window, which reduces memory usage. Empirical evaluations on both real and synthetic datasets show the superiority of the proposed algorithm in terms of runtime and memory requirements. Moreover, it produces higher-quality mining results.
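The batch-oriented idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the class names `Node` and `BatchWindowMiner`, the `max_len` cap on itemset size, and the brute-force enumeration of itemsets are all assumptions made for illustration. What the sketch does show is how keeping per-batch counts in each prefix-tree node lets an old batch be removed without storing the window's transactions.

```python
from collections import deque
from itertools import combinations


class Node:
    """Prefix-tree node: one itemset, with its support split by batch."""

    def __init__(self):
        self.children = {}      # item -> Node
        self.batch_counts = {}  # batch id -> support of this itemset in that batch
        self.total = 0          # support over all batches currently in the window


class BatchWindowMiner:
    """Hypothetical sketch: a sliding window measured in batches."""

    def __init__(self, window_batches, max_len=3):
        self.window_batches = window_batches
        self.max_len = max_len        # assumed cap on itemset length
        self.root = Node()
        self.batch_ids = deque()      # ids of batches in the window, oldest first
        self.batch_sizes = {}         # batch id -> number of transactions
        self.next_id = 0

    def add_batch(self, transactions):
        bid = self.next_id
        self.next_id += 1
        self.batch_ids.append(bid)
        self.batch_sizes[bid] = len(transactions)
        for t in transactions:
            items = sorted(set(t))
            # Brute-force enumeration (an assumption, not the paper's traversal).
            for k in range(1, min(self.max_len, len(items)) + 1):
                for itemset in combinations(items, k):
                    node = self.root
                    for item in itemset:
                        node = node.children.setdefault(item, Node())
                    node.batch_counts[bid] = node.batch_counts.get(bid, 0) + 1
                    node.total += 1
        if len(self.batch_ids) > self.window_batches:
            old = self.batch_ids.popleft()
            del self.batch_sizes[old]
            self._expire(self.root, old)

    def _expire(self, node, old):
        # Subtract the expired batch's counts; no stored transactions are needed.
        for item in list(node.children):
            child = node.children[item]
            child.total -= child.batch_counts.pop(old, 0)
            if child.total == 0:
                del node.children[item]  # supersets cannot have support either
            else:
                self._expire(child, old)

    def frequent(self, min_support):
        threshold = min_support * sum(self.batch_sizes.values())
        result = {}
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for item, child in node.children.items():
                if child.total >= threshold:  # descend only below frequent nodes
                    result[prefix + (item,)] = child.total
                    stack.append((prefix + (item,), child))
        return result
```

For example, with `window_batches=2`, adding a third batch expires the first one by walking the tree and subtracting that batch's stored counts. The sketch only removes itemsets whose support in the window drops to zero, so its counts stay exact; pruning infrequent itemsets earlier, as the abstract describes, would trade exactness for memory.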
Pages: 1001-1020
Page count: 20