Data Stream Clustering with Affinity Propagation

Cited by: 65
Authors
Zhang, Xiangliang [1]
Furtlehner, Cyril [2]
Germain-Renaud, Cecile [2]
Sebag, Michele [2]
Affiliations
[1] KAUST, Thuwal 23955-6900, Saudi Arabia
[2] Univ Paris 11, CNRS, TAO INRIA, F-91405 Orsay, France
Keywords
Streaming data clustering; affinity propagation; grid monitoring; autonomic computing; SETS
DOI
10.1109/TKDE.2013.146
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data stream clustering provides insights into the underlying patterns of data flows. This paper focuses on selecting the best representatives from clusters of streaming data. There are two main challenges: how to cluster with the best representatives and how to handle the evolving patterns that are important characteristics of streaming data with dynamic distributions. We employ the Affinity Propagation (AP) algorithm presented in 2007 by Frey and Dueck for the first challenge, as it offers good guarantees of clustering optimality for selecting exemplars. The second challenging problem is solved by change detection. The presented STRAP algorithm combines AP with a statistical change point detection test; the clustering model is rebuilt whenever the test detects a change in the underlying data distribution. Besides the validation on two benchmark data sets, the presented algorithm is validated on a real-world application, monitoring the data flow of jobs submitted to the EGEE grid.
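The rebuild-on-change loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name its change detection test, so the Page-Hinkley test is used as one common choice; the `rebuild` function is a simple greedy stand-in for the clustering step, not Affinity Propagation; and the 1-D data, the `eps` radius, and all parameter values are hypothetical.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a sequence."""

    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def rebuild(points, eps):
    """Stand-in for the clustering step (NOT Affinity Propagation).

    Greedily keeps a point as an exemplar if no existing exemplar
    lies within eps of it, so exemplars are always actual data points.
    """
    exemplars = []
    for p in points:
        if all(abs(p - e) > eps for e in exemplars):
            exemplars.append(p)
    return exemplars


def stream_cluster(stream, init_size=20, eps=1.0):
    """Cluster a 1-D stream, rebuilding the model when a change is detected."""
    stream = list(stream)
    exemplars = rebuild(stream[:init_size], eps)
    detector = PageHinkley()
    reservoir = []      # points that fit no current exemplar
    rebuild_times = []  # stream positions at which the model was rebuilt
    for t, x in enumerate(stream[init_size:], start=init_size):
        is_outlier = all(abs(x - e) > eps for e in exemplars)
        if is_outlier:
            reservoir.append(x)
        # The detector monitors the outlier indicator (1 = outlier, 0 = fits).
        if detector.update(1.0 if is_outlier else 0.0):
            exemplars = rebuild(exemplars + reservoir, eps)
            reservoir.clear()
            detector.reset()
            rebuild_times.append(t)
    return exemplars, rebuild_times


def jitter(i):
    """Small deterministic perturbation in [-0.2, 0.2]."""
    return (i % 5) * 0.1 - 0.2


# Synthetic stream: 100 points near 0, then the distribution shifts to near 10.
data = [jitter(i) for i in range(100)] + [10.0 + jitter(i) for i in range(100)]
exemplars, rebuild_times = stream_cluster(data)
print("exemplars:", exemplars)
print("rebuilt at positions:", rebuild_times)
```

In this sketch the detector watches the rate of points that fit no exemplar, so a sustained rise in that rate triggers a rebuild shortly after the shift at position 100. Monitoring the outlier rate is a design choice of the sketch, not something stated in the abstract.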
Pages: 1644-1656
Page count: 13
Related papers
50 in total
  • [1] Data Stream Clustering Algorithm Based on Affinity Propagation and Density
    Li Yang
    Tan Baihong
    [J]. MANUFACTURING SYSTEMS AND INDUSTRY APPLICATIONS, 2011, 267 : 444 - 449
  • [2] Affinity Propagation Clustering with Incomplete Data
    Lu, Cheng
    Song, Shiji
    Wu, Cheng
    [J]. COMPUTATIONAL INTELLIGENCE, NETWORKED SYSTEMS AND THEIR APPLICATIONS, 2014, 462 : 239 - 248
  • [3] Online Stream Clustering using Density and Affinity Propagation Algorithm
    Zhang, Jian-Peng
    Chen, Fu-Cai
    Liu, Li-Xiong
    Li, Shao-Mei
    [J]. PROCEEDINGS OF 2013 IEEE 4TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2012, : 828 - 832
  • [4] Clustering of fMRI Data Using Affinity Propagation
    Liu, Dazhong
    Lu, Wanxuan
    Zhong, Ning
    [J]. BRAIN INFORMATICS, BI 2010, 2010, 6334 : 399 - 406
  • [5] Active clustering data streams with affinity propagation
    Abdulah, Sameh
    Atwa, Walid
    Abdelmoniem, Ahmed M.
    [J]. ICT EXPRESS, 2022, 8 (02) : 276 - 282
  • [6] RAPSAMS: Robust affinity propagation clustering on static android malware stream
    Katebi, Matin
    RezaKhani, Afshin
    Joudaki, Saba
    Shiri, Mohammad Ebrahim
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (15):
  • [7] Performance Evaluation of Affinity Propagation Approaches on Data Clustering
    Refianti, R.
    Mutiara, A. B.
    Syamsudduha, A. A.
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (03) : 420 - 429
  • [8] A Parallel Affinity Propagation Clustering Algorithm in Biological Data Processing
    Wang, Minchao
    Zhang, Wu
    Dai, Dongbo
    Zhang, Huiran
    Xie, Jiang
    [J]. 2014 INTERNATIONAL CONFERENCE ON BIOLOGICAL ENGINEERING AND BIOMEDICAL (BEAB 2014), 2014, : 248 - 254
  • [9] A New Similarity Measure Based Affinity Propagation for Data Clustering
    Akash, O. M.
    Ahmad, Sharifah Sakinah Binti Syed
    Bin Azmi, Mohd Sanusi
    [J]. ADVANCED SCIENCE LETTERS, 2018, 24 (02) : 1130 - 1133
  • [10] Analysis of activity in fMRI data using affinity propagation clustering
    Zhang, Jiang
    Li, Dahuan
    Chen, Huafu
    Fang, Fang
    [J]. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING, 2011, 14 (03) : 271 - 281