Affinity Propagation Clustering Algorithm based on Spark Platform

Cited by: 0
Authors
Zhang, Lijia [1]
Cheng, Lianglun [1]
Affiliations
[1] Guangdong Univ Technol, Sch Comp Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
Keywords
Affinity propagation; Resilient Distributed Datasets; Spark; Large scale dataset
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
With the explosive growth of data, processing large-scale, complex data has become a challenge. Many clustering algorithms have been proposed, among them the Affinity Propagation (AP) clustering algorithm, which takes similarities between pairs of data points as its input. Compared with existing clustering algorithms, AP is fast and efficient on large datasets. As data volumes continue to grow, however, the time efficiency of AP is no longer satisfactory. This paper therefore proposes an AP clustering algorithm based on the Spark platform (Spark-AP). First, the dataset is partitioned into several Resilient Distributed Datasets (RDDs) according to a partitioning strategy, and exemplars are selected for each RDD. The exemplars are then merged and used as input to a further round of AP clustering, which yields a set of high-quality exemplars after convergence. Experiments show that Spark-AP performs better in both the scale of data it can process and processing time.
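The two-stage idea in the abstract (cluster each partition, merge the local exemplars, cluster again) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' Spark-AP implementation: ordinary lists stand in for RDD partitions, the round-robin split is a placeholder because the abstract does not name its partitioning strategy, and the function names `affinity_propagation` and `two_stage_ap`, the median preference, the damping factor, and the fixed iteration count are all choices made here for the example.

```python
import random
from statistics import median

def neg_sq_dist(p, q):
    # Similarity used here: negative squared Euclidean distance.
    return -sum((a - b) ** 2 for a, b in zip(p, q))

def affinity_propagation(points, damping=0.5, iterations=200):
    """Return the exemplars picked by basic AP (fixed number of iterations)."""
    n = len(points)
    if n <= 1:
        return list(points)
    S = [[neg_sq_dist(p, q) for q in points] for p in points]
    # Assumption: preference = median of the off-diagonal similarities.
    pref = median(S[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        S[k][k] = pref
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # sum over i' != k of max(0, r(i',k))
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [points[k] for k in range(n) if A[k][k] + R[k][k] > 0]

def two_stage_ap(points, num_partitions=3):
    # Stage 1: split the data (round-robin placeholder) and run AP per part.
    parts = [points[i::num_partitions] for i in range(num_partitions)]
    local = [e for part in parts for e in affinity_propagation(part)]
    # Stage 2: run AP once more on the merged local exemplars.
    return affinity_propagation(local)

# Small synthetic demo: three Gaussian blobs in 2D.
random.seed(0)
centers = [(0.0, 0.0), (10.0, 10.0), (-10.0, 10.0)]
data = [(cx + random.gauss(0, 1), cy + random.gauss(0, 1))
        for cx, cy in centers for _ in range(20)]
random.shuffle(data)
exemplars = two_stage_ap(data, num_partitions=3)
```

Because each partition only builds its own similarity matrix, the quadratic cost of AP applies to the partition size rather than to the whole dataset, which is the motivation the abstract gives for the split. In a real Spark job the per-partition step would run on the cluster rather than in a local loop.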
Pages: 532-535
Number of pages: 4
Related papers
50 in total
  • [31] Semi-supervised Affinity Propagation Clustering Algorithm based on Fireworks Explosion Optimization
    Wang Limin
    Han Xuming
    Ji Qiang
    2014 INTERNATIONAL CONFERENCE ON MANAGEMENT OF E-COMMERCE AND E-GOVERNMENT (ICMECG), 2014, : 273 - 279
  • [32] A novel neural network classification model based on covering and Affinity Propagation clustering algorithm
    Li, Hui
    Ding, Shifei
    Journal of Computational Information Systems, 2013, 9 (07): : 2565 - 2573
  • [33] Constraint Rules and Matching Micro-clusters Based Affinity Propagation Clustering Algorithm
    Wang, Li-min
    Zhou, You
    Han, Xu-ming
    Wang, Yi-zhang
    Yu, Jing-lin
    Wang, Shuai
    STUDIES IN INFORMATICS AND CONTROL, 2020, 29 (03): : 353 - 362
  • [34] An improved affinity propagation clustering algorithm based on principal component analysis and variation coefficient
    Han, Xuming, 1600, Inderscience Enterprises Ltd., 29, route de Pre-Bois, Case Postale 856, CH-1215 Geneva 15, CH-1215, Switzerland (07):
  • [35] ADAPTIVE SEMI-SUPERVISED AFFINITY PROPAGATION CLUSTERING ALGORITHM BASED ON STRUCTURAL SIMILARITY
    Wang, Limin
    Ji, Qiang
    Han, Xuming
    TEHNICKI VJESNIK-TECHNICAL GAZETTE, 2016, 23 (02): : 425 - 435
  • [36] A density-adaptive affinity propagation clustering algorithm based on spectral dimension reduction
    Jia, Hongjie
    Ding, Shifei
    Meng, Lingheng
    Fan, Shuyan
    NEURAL COMPUTING & APPLICATIONS, 2014, 25 (7-8): : 1557 - 1567
  • [37] A density-adaptive affinity propagation clustering algorithm based on spectral dimension reduction
    Hongjie Jia
    Shifei Ding
    Lingheng Meng
    Shuyan Fan
    Neural Computing and Applications, 2014, 25 : 1557 - 1567
  • [38] Optimal Preference Detection Based on Golden Section and Genetic Algorithm for Affinity Propagation Clustering
    Jiao, Libin
    Zhang, Guangzhi
    Wang, Shenling
    Mehmood, Rashid
    Bie, Rongfang
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, 2015, 9204 : 253 - 262
  • [39] Research on Mini-Batch Affinity Propagation Clustering Algorithm
    Xu, Ziqi
    Lu, Yahui
    Jiang, Yu
    2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022, : 86 - 95
  • [40] A Parallel Affinity Propagation Clustering Algorithm in Biological Data Processing
    Wang, Minchao
    Zhang, Wu
    Dai, Dongbo
    Zhang, Huiran
    Xie, Jiang
    2014 INTERNATIONAL CONFERENCE ON BIOLOGICAL ENGINEERING AND BIOMEDICAL (BEAB 2014), 2014, : 248 - 254