CLUSTERING LARGE-SCALE DATA BASED ON MODIFIED AFFINITY PROPAGATION ALGORITHM

Cited by: 28
Authors
Serdah, Ahmed M. [1]
Ashour, Wesam M. [1]
Affiliations
[1] Islamic Univ Gaza, Comp Engn Dept, Gaza 108, Palestine
DOI
10.1515/jaiscr-2016-0003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditional clustering algorithms are no longer suitable for data mining applications that work with large-scale data. Many large-scale data clustering algorithms have been proposed in recent years, but most of them do not achieve high-quality clustering. Although Affinity Propagation (AP) is effective and accurate for clustering ordinary data sets, it is not effective for large-scale data. This paper proposes two methods for large-scale data clustering that rely on a modified version of the AP algorithm. The proposed methods are designed to ensure both low time complexity and good clustering accuracy. First, the data set is divided into several subsets using one of two methods: random fragmentation or K-means. Second, each subset is clustered into K clusters using the K-Affinity Propagation (KAP) algorithm to select local cluster exemplars in each subset. Third, the inverse weighted clustering algorithm is run on all local cluster exemplars to select well-suited global exemplars for the whole data set. Finally, all data points are clustered according to the similarity between each data point and the global exemplars. Results show that the proposed clustering method significantly reduces clustering time and produces clustering results that are more effective and accurate than those of the AP, KAP, and HAP algorithms.
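The four steps in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the details of KAP or of inverse weighted clustering, so a plain k-medoids routine (the hypothetical helper pick_exemplars) stands in for both. The function names, the negative squared Euclidean distance as the similarity, and the parameters n_subsets and seed are all assumptions made for illustration. Only random fragmentation is shown for step 1.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance; its negative plays the role of similarity."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pick_exemplars(points, k, iters=20):
    """Return up to k exemplars chosen from points.

    ASSUMPTION: plain k-medoids, used as a stand-in for both KAP (step 2)
    and inverse weighted clustering (step 3), which the abstract names
    but does not specify.
    """
    k = min(k, len(points))
    # Farthest-first initialisation keeps the sketch deterministic.
    exemplars = [points[0]]
    while len(exemplars) < k:
        exemplars.append(
            max(points, key=lambda p: min(sq_dist(p, e) for e in exemplars))
        )
    for _ in range(iters):
        # Assign every point to its nearest exemplar.
        groups = [[] for _ in exemplars]
        for p in points:
            j = min(range(len(exemplars)), key=lambda i: sq_dist(p, exemplars[i]))
            groups[j].append(p)
        # Re-pick each exemplar as the medoid of its group.
        updated = [
            min(g, key=lambda c: sum(sq_dist(c, q) for q in g)) if g else e
            for g, e in zip(groups, exemplars)
        ]
        if updated == exemplars:
            break
        exemplars = updated
    return exemplars


def cluster_large(points, k, n_subsets, seed=0):
    """Four-step pipeline: fragment, local exemplars, global exemplars, assign."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    # Step 1: random fragmentation into n_subsets parts.
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    # Step 2: k local exemplars per non-empty subset.
    local = [e for s in subsets if s for e in pick_exemplars(s, k)]
    # Step 3: k global exemplars chosen among the local exemplars.
    global_exemplars = pick_exemplars(local, k)
    # Step 4: label each point with its most similar global exemplar.
    labels = [
        min(range(len(global_exemplars)),
            key=lambda i: sq_dist(p, global_exemplars[i]))
        for p in points
    ]
    return global_exemplars, labels
```

Under these assumptions, only the subsets and the much smaller set of local exemplars are ever clustered directly, which is where a reduction in clustering time would come from. A call such as cluster_large(data, k=2, n_subsets=4) returns the global exemplars and one label per input point.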
Pages: 23-33
Page count: 11