A Multi-Domain Architecture for Mining Frequent Items and Itemsets from Distributed Data Streams

Authors
Eugenio Cesario
Carlo Mastroianni
Domenico Talia
Affiliations
[1] ICAR-CNR, ICAR
[2] University of Calabria, CNR and DIMES
Source
Journal of Grid Computing | 2014, Vol. 12 (1)
Keywords
Distributed data mining; Frequent items; Frequent itemsets; Grid; Stream mining
Abstract
Real-time analysis of distributed data streams is challenging because it requires scalable solutions that can handle data generated very rapidly by multiple sources. This paper presents the design and implementation of an architecture for analyzing data streams in distributed environments. In particular, the analysis computes the items and itemsets whose frequency exceeds a given threshold. The mining approach is hybrid: frequent items are computed in a single pass using a sketch algorithm, while frequent itemsets are computed by a further multi-pass analysis. The architecture combines parallel and distributed processing to keep pace with the rate of the distributed data streams. To keep computation close to the data, miners are distributed among the domains where the data streams are generated. The paper reports experimental results obtained with a prototype of the architecture, tested on a Grid composed of three domains, each handling one data stream.
Pages: 153-168
Page count: 15
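
The hybrid approach summarized in the abstract (a single-pass sketch for frequent items, followed by a multi-pass analysis for frequent itemsets) can be illustrated with a minimal single-machine sketch. The abstract does not name the sketch algorithm or the itemset algorithm, so a Count-Min sketch and an Apriori-style level-wise scan are used here purely as assumptions; all class and function names, the table sizes, and the sample window are hypothetical, and the distributed, multi-domain part of the architecture is not modeled.

```python
import hashlib
from itertools import combinations


class CountMinSketch:
    """Fixed-size counter table (assumed sketch type); estimates never undercount."""

    def __init__(self, width=512, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One column per row, derived from a row-salted hash of the item.
        for row in range(self.depth):
            digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def estimate(self, item):
        return min(self.table[row][col] for row, col in self._cells(item))


def frequent_items(transactions, min_support):
    """Single pass: update the sketch, then keep items whose estimate reaches the threshold."""
    sketch = CountMinSketch()
    seen = set()  # simplification: a real stream miner would not store every distinct item
    for transaction in transactions:
        for item in set(transaction):
            sketch.add(item)
            seen.add(item)
    return {item for item in seen if sketch.estimate(item) >= min_support}


def frequent_itemsets(transactions, items, min_support, max_size=3):
    """Multi-pass: one exact scan of the buffered window per itemset size."""
    result = {}
    current = {frozenset([item]) for item in items}
    size = 1
    while current and size <= max_size:
        counts = {candidate: 0 for candidate in current}
        for transaction in transactions:
            present = set(transaction)
            for candidate in current:
                if candidate <= present:
                    counts[candidate] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        result.update(level)
        # Candidates for the next pass: unions of frequent sets that grow by one item.
        size += 1
        current = {a | b for a, b in combinations(list(level), 2) if len(a | b) == size}
    return result


# Hypothetical window of transactions buffered from one stream.
window = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
items = frequent_items(window, min_support=3)
itemsets = frequent_itemsets(window, items, min_support=3)
```

Because the sketch can overestimate but never underestimate, the first stage may let through a few infrequent items; the exact counts of the second stage filter them out again, which is one plausible reason for pairing a cheap single pass with a more expensive multi-pass step.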