Fast affinity propagation clustering based on incomplete similarity matrix

Cited by: 25

Authors:

Sun, Leilei [1]

Guo, Chonghui [1]

Liu, Chuanren [2]

Xiong, Hui [3]

Affiliations:

[1] Dalian Univ Technol, Inst Syst Engn, Dalian, Liaoning, Peoples R China

[2] Drexel Univ, Decis Sci & MIS Dept, Philadelphia, PA 19104 USA

[3] Rutgers State Univ, Management Sci & Informat Syst Dept, Newark, NJ USA

Source:

KNOWLEDGE AND INFORMATION SYSTEMS | 2017 / Vol. 51 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

Exemplar-based clustering; Affinity propagation; Incomplete similarity matrix; Fast algorithm; DIMENSIONALITY REDUCTION;

DOI:

10.1007/s10115-016-0996-y

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Affinity propagation (AP) is a recently proposed clustering algorithm that has been used successfully in many practical problems. Although AP is effective at finding meaningful clustering solutions, its key disadvantage is low efficiency, which becomes the bottleneck when applying AP to large-scale problems. In the literature, most methods proposed to improve the efficiency of AP implement the message passing on a sparse similarity matrix, but neither the loss in effectiveness nor the gain in efficiency is analyzed theoretically. In this paper, we propose a two-stage fast affinity propagation (FastAP) algorithm. Unlike previous work, the similarity matrix is first compressed by selecting only potential exemplars, and then further reduced by sparsification according to the k nearest neighbors. More importantly, we provide a theoretical analysis, based on which the efficiency improvement of our method is controllable with guaranteed clustering performance. In experiments, two synthetic data sets, seven publicly available data sets, and two real-world streaming data sets are used to evaluate the proposed method. The results demonstrate that FastAP achieves clustering performance comparable to that of the original AP algorithm, while improving computational efficiency with a several-fold speed-up on small data sets and a dozens-of-fold speed-up on larger-scale data sets.
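As a rough illustration of the second stage mentioned in the abstract (sparsifying the similarity matrix by k nearest neighbors), the following is a minimal sketch and not the authors' implementation. It assumes negative squared Euclidean distance as the similarity, a common choice for AP that the abstract does not specify, and it marks dropped entries as -inf to represent an incomplete matrix. The function names `knn_sparsify` and `density` are hypothetical. The first stage (selecting potential exemplars) is omitted because the abstract does not describe its selection rule.

```python
import math


def knn_sparsify(points, k):
    """Build a similarity matrix that keeps only each point's k nearest neighbors.

    Assumption: similarity is negative squared Euclidean distance.
    Dropped entries are set to -inf, i.e. treated as missing. The diagonal
    (the AP "preferences") is left at -inf here and would be set separately.
    """
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(points)")
    sim = [[-math.inf] * n for _ in range(n)]
    for i, p in enumerate(points):
        # Similarities from point i to every other point.
        scored = [
            (-sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        ]
        # Largest similarity first; ties are broken by the smaller index.
        scored.sort(key=lambda t: (-t[0], t[1]))
        for s, j in scored[:k]:
            sim[i][j] = s
    return sim


def density(sim):
    """Fraction of off-diagonal entries that were retained."""
    n = len(sim)
    kept = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and sim[i][j] != -math.inf
    )
    return kept / (n * (n - 1))
```

For example, `knn_sparsify([(0, 0), (0, 1), (5, 5), (5, 6)], 1)` keeps one neighbor per point, so only 4 of the 12 off-diagonal similarities remain. Message passing restricted to the retained entries is what would make a sparse variant cheaper than AP on the full matrix.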

Pages: 941-963

Number of pages: 23
