Fast affinity propagation clustering based on incomplete similarity matrix

Cited by: 25
Authors
Sun, Leilei [1]
Guo, Chonghui [1]
Liu, Chuanren [2]
Xiong, Hui [3]
Affiliations
[1] Dalian Univ Technol, Inst Syst Engn, Dalian, Liaoning, Peoples R China
[2] Drexel Univ, Decis Sci & MIS Dept, Philadelphia, PA 19104 USA
[3] Rutgers State Univ, Management Sci & Informat Syst Dept, Newark, NJ USA
Funding
National Natural Science Foundation of China;
Keywords
Exemplar-based clustering; Affinity propagation; Incomplete similarity matrix; Fast algorithm; DIMENSIONALITY REDUCTION;
DOI
10.1007/s10115-016-0996-y
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Affinity propagation (AP) is a recently proposed clustering algorithm that has been successfully applied to many practical problems. Although AP is effective at finding meaningful clustering solutions, its key disadvantage is low efficiency, which becomes the bottleneck when applying AP to large-scale problems. In the literature, most methods proposed to improve the efficiency of AP implement message passing on a sparse similarity matrix, but neither the decline in effectiveness nor the improvement in efficiency is analyzed theoretically. In this paper, we propose a two-stage fast affinity propagation (FastAP) algorithm. Unlike previous work, the similarity matrix is first compressed by selecting only potential exemplars and then further reduced by sparsification according to the k nearest neighbors. More importantly, we provide a theoretical analysis, based on which the efficiency improvement of our method is controllable with guaranteed clustering performance. In experiments, two synthetic data sets, seven publicly available data sets, and two real-world streaming data sets are used to evaluate the proposed method. The results demonstrate that FastAP achieves clustering performance comparable to that of the original AP algorithm, while computational efficiency improves with a several-fold speed-up on small data sets and a dozens-of-fold speed-up on larger-scale data sets.
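The abstract mentions sparsifying the similarity matrix according to k nearest neighbors but does not give the procedure, so the following is only a generic illustration of that idea, not the authors' FastAP algorithm. It is a minimal sketch in Python with NumPy, assuming a dense square similarity matrix whose diagonal holds the AP preferences. The function name `knn_sparsify` and the convention of marking dropped pairs with negative infinity are assumptions made for this example.

```python
import numpy as np


def knn_sparsify(S, k):
    """Keep, in each row of S, only the k largest off-diagonal similarities.

    Dropped entries are set to -inf (an assumed convention meaning "no
    message is passed between this pair"). The diagonal, which holds the
    preferences in affinity propagation, is always kept.
    """
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    if S.shape != (n, n):
        raise ValueError("S must be a square matrix")
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")

    # Exclude self-similarities from the neighbor ranking.
    off_diag = S.copy()
    np.fill_diagonal(off_diag, -np.inf)

    # Column indices of the k largest off-diagonal entries in each row.
    neighbors = np.argpartition(-off_diag, k - 1, axis=1)[:, :k]

    sparse = np.full_like(S, -np.inf)
    rows = np.arange(n)[:, None]
    sparse[rows, neighbors] = S[rows, neighbors]
    np.fill_diagonal(sparse, np.diag(S))  # restore the preferences
    return sparse


if __name__ == "__main__":
    # Five points on a line; similarity is the negative squared distance.
    x = np.array([0.0, 1.0, 3.0, 10.0, 12.0])
    S = -(x[:, None] - x[None, :]) ** 2
    # Common AP choice: set every preference to the median similarity.
    preference = np.median(S[~np.eye(len(x), dtype=bool)])
    np.fill_diagonal(S, preference)
    print(knn_sparsify(S, k=2))
```

Each row of the result keeps k similarities plus the preference, so the number of pairs that exchange messages grows with n·k instead of n². Note that the resulting pattern is not necessarily symmetric, since point i can be among the nearest neighbors of j without the reverse holding.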
Pages: 941-963
Number of pages: 23
Related papers
50 in total
  • [1] Fast affinity propagation clustering based on incomplete similarity matrix
    Leilei Sun
    Chonghui Guo
    Chuanren Liu
    Hui Xiong
    [J]. Knowledge and Information Systems, 2017, 51 : 941 - 963
  • [2] Affinity Propagation Clustering Using Path Based Similarity
    Jiang, Yuan
    Liao, Yuliang
    Yu, Guoxian
    [J]. ALGORITHMS, 2016, 9 (03)
  • [3] Affinity Propagation Clustering with Incomplete Data
    Lu, Cheng
    Song, Shiji
    Wu, Cheng
    [J]. COMPUTATIONAL INTELLIGENCE, NETWORKED SYSTEMS AND THEIR APPLICATIONS, 2014, 462 : 239 - 248
  • [4] A New Similarity Measure Based Affinity Propagation for Data Clustering
    Akash, O. M.
    Ahmad, Sharifah Sakinah Binti Syed
    Bin Azmi, Mohd Sanusi
    [J]. ADVANCED SCIENCE LETTERS, 2018, 24 (02) : 1130 - 1133
  • [5] Affinity propagation clustering based on variable-similarity measure
    Dong, Jun
    Wang, Suo-Ping
    Xiong, Fan-Lun
    [J]. Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2010, 32 (03) : 509 - 514
  • [6] Fast Clustering by Affinity Propagation Based on Density Peaks
    Li, Yang
    Guo, Chonghui
    Sun, Leilei
    [J]. IEEE ACCESS, 2020, 8 : 138884 - 138897
  • [7] Semisupervised Clustering for Networks Based on Fast Affinity Propagation
    Zhu, Mu
    Meng, Fanrong
    Zhou, Yong
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2013, 2013
  • [8] Constructing affinity matrix in spectral clustering based on neighbor propagation
    Li, Xin-Ye
    Guo, Li-jie
    [J]. NEUROCOMPUTING, 2012, 97 : 125 - 130
  • [9] Fast affinity propagation clustering: A multilevel approach
    Shang, Fanhua
    Jiao, L. C.
    Shi, Jiarong
    Wang, Fei
    Gong, Maoguo
    [J]. PATTERN RECOGNITION, 2012, 45 (01) : 474 - 486
  • [10] Clustering Protein Sequences Using Affinity Propagation Based on an Improved Similarity Measure
    Yang, Fan
    Zhu, Qing-Xin
    Tang, Dong-Ming
    Zhao, Ming-Yuan
    [J]. EVOLUTIONARY BIOINFORMATICS, 2009, 5 : 137 - 146