Semi-supervised clustering methods

Cited by: 110
Authors
Bair, Eric [1, 2]
Affiliations
[1] Univ N Carolina, Dept Endodont, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
Keywords
cluster analysis; high-dimensional data; semi-supervised methods; machine learning
DOI
10.1002/wics.1270
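Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714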
Abstract
Cluster analysis methods seek to partition a data set into homogeneous subgroups. Cluster analysis is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning that there is no outcome variable nor is anything known about the relationship between the observations in the data set. In many situations, however, information about the clusters is available in addition to the values of the features. For example, the cluster labels of some observations may be known, or certain observations may be known to belong to the same cluster. In other cases, one may wish to identify clusters that are associated with a particular outcome variable. This review describes several clustering algorithms (known as 'semi-supervised clustering' methods) that can be applied in these situations. The majority of these methods are modifications of the popular k-means clustering method, and several of them will be described in detail. A brief description of some other semi-supervised clustering algorithms is also provided. © 2013 Wiley Periodicals, Inc.
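To make the "known cluster labels" setting in the abstract concrete, here is a minimal sketch of one possible k-means modification: centroids are initialized from the labeled observations, and those observations keep their labels during every assignment step. This is an illustrative assumption, not necessarily any algorithm as presented in the review; the function name `seeded_kmeans` and its parameters are hypothetical.

```python
from math import dist


def seeded_kmeans(points, seed_labels, k, n_iter=100):
    """Sketch of k-means with fixed labeled seeds (hypothetical helper).

    points      : list of equal-length numeric tuples
    seed_labels : dict mapping point index -> known cluster id in 0..k-1
    Assumes every cluster has at least one labeled observation.
    """
    def mean(rows):
        # Coordinate-wise mean of a non-empty list of tuples.
        return tuple(sum(col) / len(rows) for col in zip(*rows))

    # Initialize each centroid from the labeled observations of that cluster.
    centroids = []
    for c in range(k):
        members = [points[i] for i, lab in seed_labels.items() if lab == c]
        if not members:
            raise ValueError(f"cluster {c} has no labeled observation")
        centroids.append(mean(members))

    labels = None
    for _ in range(n_iter):
        # Assignment step: labeled points keep their known label,
        # unlabeled points go to the nearest centroid.
        new_labels = [
            seed_labels[i] if i in seed_labels
            else min(range(k), key=lambda c: dist(p, centroids[c]))
            for i, p in enumerate(points)
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each centroid from its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids


# Toy data: two well-separated groups, one labeled observation in each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = seeded_kmeans(points, seed_labels={0: 0, 3: 1}, k=2)
```

Keeping the labeled observations fixed is one design choice; a looser variant would use them only for initialization and let later assignment steps relabel them.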
Pages: 349-361
Number of pages: 13