On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification

Cited by: 0
Authors
Yang, Yingzhen [1 ]
Liang, Feng [1 ]
Yan, Shuicheng [2 ]
Wang, Zhangyang [1 ]
Huang, Thomas S. [1 ]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Natl Univ Singapore, Singapore 117576, Singapore
Funding
National Science Foundation (USA);
Keywords
RATES; CONSISTENCY; UNIFORM;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pairwise clustering methods partition the data space into clusters according to the pairwise similarity between data points. The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, where kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class classification. This pairwise clustering framework learns an unsupervised nonparametric classifier from each data partition, and searches for the optimal partition of the data by minimizing the generalization error of the learned classifiers associated with the data partitions. We consider two nonparametric classifiers in this framework, i.e., the nearest neighbor classifier and the plug-in classifier. When the underlying data distribution is modeled by nonparametric kernel density estimation, the generalization error bounds for both unsupervised nonparametric classifiers are sums of nonparametric pairwise similarity terms between the data points, which serve the purpose of clustering. Under a uniform distribution, the nonparametric similarity terms induced by both unsupervised classifiers exhibit a well-known form of kernel similarity. We also prove that the generalization error bound for the unsupervised plug-in classifier is asymptotically equal to the weighted volume of cluster boundary [1] for Low Density Separation, a widely used criterion for semi-supervised learning and clustering. Based on the derived nonparametric pairwise similarity using the plug-in classifier, we propose a new nonparametric exemplar-based clustering method with enhanced discriminative capability, whose superiority is evidenced by the experimental results.
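As a rough illustration of the ingredients named in the abstract, the sketch below computes a Gaussian kernel similarity matrix, a kernel density estimate at the sample points, and the total similarity between points placed in different clusters. It is not the similarity derived in the paper: the Gaussian kernel, the bandwidth `h`, the function names, and the cross-cluster cost are all assumptions chosen only to show how a partition can be scored by a sum of pairwise similarity terms.

```python
import numpy as np

def gaussian_kernel_matrix(X, h):
    """Pairwise similarity K[i, j] = exp(-||x_i - x_j||^2 / (2 h^2)).

    The Gaussian kernel and the bandwidth h are illustrative assumptions.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * h ** 2))

def kde_at_samples(K, h, d):
    """Kernel density estimate at each sample point (d = data dimension).

    Each estimate is the row mean of K times the Gaussian normalizing constant.
    """
    norm = (2.0 * np.pi * h ** 2) ** (-d / 2.0)
    return norm * K.mean(axis=1)

def between_cluster_similarity(K, labels):
    """Sum of K[i, j] over pairs i < j that fall in different clusters.

    A hypothetical partition score: lower means the clusters are
    separated by pairs with low similarity.
    """
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    return float(np.sum(np.triu(K * different, k=1)))

# Two well-separated pairs of points in the plane.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
K = gaussian_kernel_matrix(X, h=1.0)
density = kde_at_samples(K, h=1.0, d=2)

good = between_cluster_similarity(K, [0, 0, 1, 1])  # respects the gap
bad = between_cluster_similarity(K, [0, 1, 0, 1])   # splits each nearby pair
print(good < bad)
```

Under these assumptions, a partition that cuts through nearby points accumulates large similarity terms, while one that cuts through the low-density gap accumulates almost none, which is the intuition behind Low Density Separation mentioned in the abstract.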
Pages: 9
Related papers
50 in total
  • [21] Similarity-Based Clustering For IoT Device Classification
    Dupont, Guillaume
    Leite, Cristoffer
    dos Santos, Daniel Ricardo
    Costante, Elisa
    den Hartog, Jerry
    Etalle, Sandro
    2021 IEEE INTERNATIONAL CONFERENCE ON OMNI-LAYER INTELLIGENT SYSTEMS (IEEE COINS 2021), 2021, : 104 - 110
  • [22] Malware Detection Using Nonparametric Bayesian Clustering and Classification Techniques
    Kao, Yimin
    Reich, Brian
    Storlie, Curtis
    Anderson, Blake
    TECHNOMETRICS, 2015, 57 (04) : 535 - 546
  • [23] Deep semi-supervised clustering based on pairwise constraints and sample similarity
    Qin, Xiao
    Yuan, Changan
    Jiang, Jianhui
    Chen, Long
    PATTERN RECOGNITION LETTERS, 2024, 178 : 1 - 6
  • [24] Pairwise Similarity Propagation Based Graph Clustering for Scalable Object Indexing and Retrieval
    Xia, Shengping
    Hancock, Edwin R.
    GRAPH-BASED REPRESENTATIONS IN PATTERN RECOGNITION, PROCEEDINGS, 2009, 5534 : 184 - +
  • [25] A randomized algorithm for pairwise clustering
    Gdalyahu, Y
    Weinshall, D
    Werman, M
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11, 1999, 11 : 424 - 430
  • [26] Pairwise clustering and graphical models
    Shental, N
    Zomet, A
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 16, 2004, 16 : 185 - 192
  • [27] Dominant sets and pairwise clustering
    Pavan, Massimiliano
    Pelillo, Marcello
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2007, 29 (01) : 167 - 172
  • [28] Pairwise data clustering and applications
    Wu, XD
    Chen, DZ
    Mason, JJ
    Schmid, SR
    COMPUTING AND COMBINATORICS, PROCEEDINGS, 2003, 2697 : 455 - 466
  • [29] Document clustering with pairwise constraints
    Kreesuradej, W
    Suwanlamai, A
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2006, 20 (02) : 241 - 254
  • [30] Data Visualization & Clustering: Generative Topographic Mapping Similarity Assessment Allied to Graph Theory Clustering
    Escobar, Matheus de Souza
    Kaneko, Hiromasa
    Funatsu, Kimito
    FRONTIERS IN MOLECULAR DESIGN AND CHEMICAL INFORMATION SCIENCE - HERMAN SKOLNIK AWARD SYMPOSIUM 2015: JURGEN BAJORATH, 2016, 1222 : 175 - 210