On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification

Cited by: 0
Authors
Yang, Yingzhen [1 ]
Liang, Feng [1 ]
Yan, Shuicheng [2 ]
Wang, Zhangyang [1 ]
Huang, Thomas S. [1 ]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Natl Univ Singapore, Singapore 117576, Singapore
Funding
National Science Foundation (USA);
Keywords
RATES; CONSISTENCY; UNIFORM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pairwise clustering methods partition the data space into clusters according to the pairwise similarity between data points. The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, for which kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class classification. This pairwise clustering framework learns an unsupervised nonparametric classifier from each data partition, and searches for the optimal partition of the data by minimizing the generalization error of the learned classifiers associated with the data partitions. We consider two nonparametric classifiers in this framework, namely the nearest neighbor classifier and the plug-in classifier. When the underlying data distribution is modeled by nonparametric kernel density estimation, the generalization error bounds for both unsupervised nonparametric classifiers are sums of nonparametric pairwise similarity terms between the data points, which can be used for clustering. Under a uniform distribution, the nonparametric similarity terms induced by both unsupervised classifiers exhibit a well-known form of kernel similarity. We also prove that the generalization error bound for the unsupervised plug-in classifier is asymptotically equal to the weighted volume of the cluster boundary [1] for Low Density Separation, a widely used criterion for semi-supervised learning and clustering. Based on the nonparametric pairwise similarity derived using the plug-in classifier, we propose a new nonparametric exemplar-based clustering method with enhanced discriminative capability, whose superiority is evidenced by the experimental results.
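As a rough illustration of a clustering objective written as a sum of pairwise kernel similarity terms, the sketch below scores a partition by summing Gaussian kernel similarity over pairs of points assigned to different clusters. The Gaussian kernel, the bandwidth `h`, the function names, and the exact form of the score are assumptions chosen for illustration; they are not the error bounds derived in the paper.

```python
import numpy as np

def gaussian_kernel_similarity(X, h=1.0):
    """Pairwise Gaussian kernel similarity matrix (illustrative choice of kernel)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances; clip tiny negatives caused by rounding.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * h ** 2))

def cross_cluster_similarity(X, labels, h=1.0):
    """Sum of kernel similarity over unordered pairs placed in different clusters.

    A hypothetical partition score: lower values mean that similar points
    are rarely separated by the partition.
    """
    K = gaussian_kernel_similarity(X, h)
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    # The mask is symmetric, so halve the sum to count each pair once.
    return K[different].sum() / 2.0

# Two tight, well-separated groups of points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
good = cross_cluster_similarity(X, [0, 0, 1, 1])  # respects the groups
bad = cross_cluster_similarity(X, [0, 1, 0, 1])   # splits each group
```

Under these assumptions, a partition that keeps nearby points together gets a much smaller score than one that separates them, which is the sense in which minimizing a sum of pairwise similarity terms favors cluster boundaries in low-density regions.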
Pages: 9
Related papers
50 in total
  • [41] Discriminative Bayesian Nonparametric Clustering
    Nguyen, Vu
    Phung, Dinh
    Le, Trung
    Bui, Hung
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2550 - 2556
  • [42] Nonparametric clustering for image segmentation
    Menardi, Giovanna
    STATISTICAL ANALYSIS AND DATA MINING, 2020, 13 (01) : 83 - 97
  • [43] Nonparametric clustering of seismic events
    Adelfio, Giada
    Chiodi, Marcello
    De Luca, Luciana
    Luzio, Dario
    DATA ANALYSIS, CLASSIFICATION AND THE FORWARD SEARCH, 2006, : 397 - +
  • [44] Nonparametric Bayesian Clustering Ensembles
    Wang, Pu
    Domeniconi, Carlotta
    Laskey, Kathryn Blackmond
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT III, 2010, 6323 : 435 - 450
  • [45] UNIC: A fast nonparametric clustering
    Leopold, Nadiia
    Rose, Oliver
    PATTERN RECOGNITION, 2020, 100 (100)
  • [46] Clustering by pattern similarity
    Wang, Haixun
    Pei, Jian
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2008, 23 (04) : 481 - 496
  • [47] NONPARAMETRIC CLUSTERING SCHEME FOR LANDSAT
    NARENDRA, PM
    GOLDBERG, M
    PATTERN RECOGNITION, 1977, 9 (04) : 207 - 215
  • [48] Clustering by Pattern Similarity
    Wang, Haixun
    Pei, Jian
    Journal of Computer Science & Technology, 2008, (04) : 481 - 496
  • [49] Nonparametric Clustering of Functional Data
    Wang, Haiyan
    Neill, James
    Miller, Forrest
    STATISTICS AND ITS INTERFACE, 2008, 1 (01) : 47 - 62
  • [50] VARIATION ON A NONPARAMETRIC CLUSTERING METHOD
    JOHNSTON, B
    BAILEY, T
    DUBES, R
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1979, 1 (04) : 400 - 408