A Multi-Domain Architecture for Mining Frequent Items and Itemsets from Distributed Data Streams

Cited by: 8
Authors
Cesario, Eugenio [1]
Mastroianni, Carlo [1]
Talia, Domenico [2, 3]
Affiliations
[1] ICAR CNR, I-87036 Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, ICAR CNR, I-87036 Arcavacata Di Rende, CS, Italy
[3] Univ Calabria, DIMES, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
Distributed data mining; Frequent items; Frequent itemsets; Grid; Stream mining
DOI
10.1007/s10723-013-9277-0
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Real-time analysis of distributed data streams is a challenging task, since it requires scalable solutions to handle streams of data that are generated very rapidly by multiple sources. This paper presents the design and implementation of an architecture for the analysis of data streams in distributed environments. In particular, the analysis computes the items and itemsets whose frequency exceeds a given threshold. The mining approach is hybrid: frequent items are calculated in a single pass, using a sketch algorithm, while frequent itemsets are calculated by a further multi-pass analysis. The architecture combines parallel and distributed processing to keep pace with the rate of the distributed data streams. To keep computation close to the data, miners are distributed among the domains where the data streams are generated. The paper reports the experimental results obtained with a prototype of the architecture, tested on a Grid composed of three domains, each handling one data stream.
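The hybrid approach described in the abstract can be illustrated with a minimal sketch. The abstract does not name the sketch algorithm, so a Count-Min sketch is assumed here purely for illustration; the names `CountMinSketch`, `frequent_items`, and `frequent_pairs`, the table sizes, and the sample data are all hypothetical and are not taken from the paper. The further pass is restricted to pairs to keep the example short.

```python
import hashlib
from collections import Counter
from itertools import combinations


class CountMinSketch:
    """Fixed-size frequency summary (assumed technique); never undercounts."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One column per row, chosen by a row-salted hash of the item.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # The minimum over rows is the least-inflated counter.
        return min(self.table[row][col] for row, col in self._cells(item))


def frequent_items(transactions, min_support):
    """Single pass: update the sketch, then keep items above the threshold."""
    sketch = CountMinSketch()
    seen = set()  # kept only for illustration; a real miner would bound this
    total = 0
    for transaction in transactions:
        total += 1
        for item in set(transaction):
            sketch.add(item)
            seen.add(item)
    return {i for i in seen if sketch.estimate(i) >= min_support * total}


def frequent_pairs(transactions, items, min_support):
    """Further pass: count exactly, but only pairs made of frequent items."""
    counts = Counter()
    total = 0
    for transaction in transactions:
        total += 1
        kept = sorted(set(transaction) & items)
        counts.update(combinations(kept, 2))
    return {pair for pair, n in counts.items() if n >= min_support * total}


# Hypothetical window of four transactions from one stream.
window = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
items = frequent_items(window, 0.5)
pairs = frequent_pairs(window, items, 0.5)
```

Because the sketch can only overestimate, `items` may contain a few false positives but never misses a truly frequent item; the exact counting in the second pass then filters the itemsets.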
Pages: 153-168
Page count: 16
Related papers
50 in total
  • [1] A Multi-Domain Architecture for Mining Frequent Items and Itemsets from Distributed Data Streams
    Eugenio Cesario
    Carlo Mastroianni
    Domenico Talia
    Journal of Grid Computing, 2014, 12 : 153 - 168
  • [2] Mining frequent items and itemsets from distributed data streams for emergency detection and management
    Altomare, Albino
    Cesario, Eugenio
    Talia, Domenico
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2017, 8 (01) : 47 - 55
  • [3] Mining frequent items and itemsets from distributed data streams for emergency detection and management
    Albino Altomare
    Eugenio Cesario
    Domenico Talia
    Journal of Ambient Intelligence and Humanized Computing, 2017, 8 : 47 - 55
  • [4] Mining maximal frequent itemsets from data streams
    Mao, Guojun
    Wu, Xindong
    Zhu, Xingquan
    Chen, Gong
    Liu, Chunnian
    JOURNAL OF INFORMATION SCIENCE, 2007, 33 (03) : 251 - 262
  • [5] Mining of Frequent Itemsets from Streams of Uncertain Data
    Leung, Carson Kai-Sang
    Hao, Boyu
    ICDE: 2009 IEEE 25TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2009, : 1663 - 1670
  • [6] Efficient mining of frequent itemsets from data streams
    Leung, Carson Kai-Sang
    Brajczuk, Dale A.
    SHARING DATA, INFORMATION AND KNOWLEDGE, PROCEEDINGS, 2008, 5071 : 2 - 14
  • [7] MFIS - Mining frequent itemsets on data streams
    Xie, Zhi-jun
    Chen, Hong
    Li, Cuiping
    ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2006, 4093 : 1085 - 1093
  • [8] Mining Recent Frequent Itemsets in Data Streams
    Li, Kun
    Wang, Yong-yan
    Ellahi, Manzoor
    Wang, Hong-an
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 4, PROCEEDINGS, 2008, : 353 - 358
  • [9] Fast Mining of Closed Frequent Itemsets in Data Streams
    Mao Yimin
    Chen Zhigang
    Liu Lixin
    INFORMATION TECHNOLOGY APPLICATIONS IN INDUSTRY, PTS 1-4, 2013, 263-266 : 231 - +
  • [10] An efficient approach to mining frequent itemsets on data streams
    Ansari, Sara
    Sadreddini, Mohammad Hadi
    World Academy of Science, Engineering and Technology, 2009, 37 : 489 - 495