On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification

Cited by: 0
Authors
Yang, Yingzhen [1]
Liang, Feng [1]
Yan, Shuicheng [2]
Wang, Zhangyang [1]
Huang, Thomas S. [1]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Natl Univ Singapore, Singapore 117576, Singapore
Funding
National Science Foundation (USA);
Keywords
RATES; CONSISTENCY; UNIFORM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pairwise clustering methods partition the data space into clusters according to the pairwise similarity between data points. The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, for which kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework that bridges the gap between clustering and multi-class classification. The framework learns an unsupervised nonparametric classifier from each data partition and searches for the optimal partition of the data by minimizing the generalization error of the learned classifiers associated with the data partitions. We consider two nonparametric classifiers in this framework: the nearest neighbor classifier and the plug-in classifier. When the underlying data distribution is modeled by nonparametric kernel density estimation, the generalization error bounds for both unsupervised nonparametric classifiers are sums of nonparametric pairwise similarity terms between the data points, which can be used for clustering. Under a uniform distribution, the nonparametric similarity terms induced by both unsupervised classifiers take a well-known form of kernel similarity. We also prove that the generalization error bound for the unsupervised plug-in classifier is asymptotically equal to the weighted volume of the cluster boundary [1] for Low Density Separation, a widely used criterion for semi-supervised learning and clustering. Based on the nonparametric pairwise similarity derived using the plug-in classifier, we propose a new nonparametric exemplar-based clustering method with enhanced discriminative capability, whose superiority is evidenced by the experimental results.
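The general idea of scoring a partition by a sum of pairwise kernel similarity terms can be illustrated with a small sketch. This is not the bound derived in the paper: it assumes a Gaussian kernel with a hypothetical bandwidth `h`, and the function names (`gaussian_kernel`, `kde`, `cross_cluster_cost`) and the toy data are illustrative choices. The cost simply adds up the kernel similarity of every pair of points placed in different clusters, so a partition that cuts through a dense region scores worse than one that separates points along a low-density gap.

```python
import math
from itertools import combinations


def gaussian_kernel(x, y, h=1.0):
    """Gaussian kernel similarity between two points (h is an assumed bandwidth)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * h * h))


def kde(x, data, h=1.0):
    """Kernel density estimate at x using the Gaussian kernel above."""
    dim = len(x)
    norm = (2.0 * math.pi * h * h) ** (dim / 2.0)
    return sum(gaussian_kernel(x, p, h) for p in data) / (len(data) * norm)


def cross_cluster_cost(data, labels, h=1.0):
    """Sum of kernel similarities over all pairs assigned to different clusters.

    Illustrative stand-in for a pairwise clustering objective, not the
    generalization error bound from the paper.
    """
    return sum(
        gaussian_kernel(data[i], data[j], h)
        for i, j in combinations(range(len(data)), 2)
        if labels[i] != labels[j]
    )


# Toy data: two tight groups far apart.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels_along_gap = [0, 0, 1, 1]      # boundary passes through the empty region
labels_through_groups = [0, 1, 0, 1]  # boundary splits both dense groups

cost_along_gap = cross_cluster_cost(points, labels_along_gap)
cost_through_groups = cross_cluster_cost(points, labels_through_groups)
```

On this toy data, `cost_along_gap` is smaller than `cost_through_groups`, and `kde` is higher near either group than at the midpoint between them, which matches the intuition behind Low Density Separation mentioned in the abstract.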
Pages: 9