Affinity Propagation Clustering Algorithm based on Spark Platform

Cited by: 0
Authors
Zhang, Lijia [1]
Cheng, Lianglun [1]
Affiliations
[1] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
Keywords
Affinity propagation; Resilient Distributed Datasets; Spark; Large scale dataset
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
With the explosive growth of data, processing large-scale, complex data has become a challenge. Many clustering algorithms have been proposed, among them the Affinity Propagation (AP) clustering algorithm, which takes the similarities between pairs of data points as its input. Compared with existing clustering algorithms, AP is fast and efficient on large datasets. As data volumes continue to grow, however, the time efficiency of AP is no longer satisfactory. This paper therefore proposes an AP clustering algorithm based on the Spark platform (Spark-AP). First, the dataset is partitioned into several Resilient Distributed Datasets (RDDs) according to a partitioning strategy, and the exemplars of each RDD are selected. The exemplars are then merged and used as input to a further round of AP clustering, which yields a set of high-quality exemplars after convergence. Experiments show that Spark-AP performs better in terms of both the scale of data it can process and processing time.
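The two-stage idea in the abstract (cluster each partition, merge the local exemplars, cluster again) can be illustrated with a minimal single-machine sketch. This is not the authors' implementation. Plain Python lists stand in for RDD partitions, the round-robin split is an assumed partitioning strategy (the abstract does not name one), and the function names, damping factor, iteration count, and median preference are illustrative choices.

```python
import random

def similarity(p, q):
    """Negative squared Euclidean distance, a common AP similarity."""
    return -sum((a - b) ** 2 for a, b in zip(p, q))

def affinity_propagation(points, damping=0.7, iterations=200):
    """Return the indices of the exemplars found by plain AP."""
    n = len(points)
    if n <= 1:
        return list(range(n))
    S = [[similarity(p, q) for q in points] for p in points]
    # Preference (self-similarity): median of the off-diagonal similarities.
    off_diag = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    pref = off_diag[len(off_diag) // 2]
    for k in range(n):
        S[k][k] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            first = scores[best]
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            col = sum(max(0.0, R[i][k]) for i in range(n) if i != k)
            for i in range(n):
                if i == k:
                    new = col
                else:
                    new = min(0.0, R[k][k] + col - max(0.0, R[i][k]))
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        # Fallback if no point qualifies: keep the single strongest candidate.
        exemplars = [max(range(n), key=lambda k: R[k][k] + A[k][k])]
    return exemplars

def two_stage_ap(points, num_partitions):
    """Stage 1: AP per partition. Stage 2: AP on the merged local exemplars."""
    # Assumed strategy: round-robin split into num_partitions chunks.
    partitions = [points[i::num_partitions] for i in range(num_partitions)]
    local = []
    for part in partitions:
        local.extend(part[k] for k in affinity_propagation(part))
    final = [local[k] for k in affinity_propagation(local)]
    # Assign every original point to its most similar final exemplar.
    labels = [max(range(len(final)), key=lambda j: similarity(p, final[j]))
              for p in points]
    return final, labels

# Small synthetic example: three well-separated 2-D blobs.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in centers for _ in range(20)]
exemplars, labels = two_stage_ap(points, num_partitions=3)
print(len(exemplars), "final exemplars")
```

The per-partition runs are independent of one another, so on Spark that stage could presumably run in parallel (for example through an operation such as `RDD.mapPartitions`), with only the much smaller set of local exemplars gathered for the second round. The abstract does not state which Spark operations the authors actually use.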
Pages: 532-535
Number of pages: 4
Related papers
50 in total
  • [41] A novel speaker clustering algorithm via supervised affinity propagation
    Zhang, Xiang
    Gao, Jie
    Lu, Ping
    Yan, Yonghong
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4369 - 4372
  • [42] Online Stream Clustering using Density and Affinity Propagation Algorithm
    Zhang, Jian-Peng
    Chen, Fu-Cai
    Liu, Li-Xiong
    Li, Shao-Mei
    PROCEEDINGS OF 2013 IEEE 4TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2012, : 828 - 832
  • [43] An Affinity Propagation Clustering Algorithm for Mixed Numeric and Categorical Datasets
    Zhang, Kang
    Gu, Xingsheng
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
  • [44] An Incremental Affinity Propagation Algorithm and Its Applications for Text Clustering
    Shi, X. H.
    Guan, R. C.
    Wang, L. P.
    Pei, Z. L.
    Liang, Y. C.
    IJCNN: 2009 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1- 6, 2009, : 2734 - 2739
  • [45] A Distributed PCM Clustering Algorithm Based on Spark
    Zhang, Yong
    Liu, Haoke
    Chen, Tianzhen
    Tang, Di
    ICMLC 2019: 2019 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2019, : 70 - 74
  • [46] Active Semi-supervised Affinity Propagation Clustering Algorithm based on Local Outlier Factor
    Qi, Lei
    Ting, Li
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 9368 - 9373
  • [47] A Dynamic Affinity Propagation Clustering Algorithm Based on MDT in Self-Healing Heterogeneous Networks
    Ma, Yu
    Zhu, Jiajia
    Liu, Liang
    Lv, Feibi
    Wang, Yang
    2016 16TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES (ISCIT), 2016, : 318 - 323
  • [48] Clustering algorithm for experimental datasets using global sensitivity-based affinity propagation (GSAP)
    Wang, Yiru
    Tao, Chenyue
    Zhou, Zijun
    Lin, Keli
    Law, Chung K.
    Yang, Bin
    COMBUSTION AND FLAME, 2024, 259
  • [49] An improved affinity propagation clustering algorithm based on entropy weight method and principal component analysis
    Limin, Wang
    Li, Zhang
    Xuming, Han
    Qiang, Ji
    Guangyu, Mu
    Ying, Liu
    International Journal of Database Theory and Application, 2016, 9 (06): : 227 - 238
  • [50] Layout optimization of soil moisture sensor in tea plantation based on affinity propagation clustering algorithm
    Zhang W.
    Zhang M.
    Hong X.
    Jiang Z.
    Jiang Y.
    Nongye Gongcheng Xuebao/Transactions of the Chinese Society of Agricultural Engineering, 2019, 35 (06): : 107 - 113