Semi-supervised model-based clustering with positive and negative constraints

Authors
Volodymyr Melnykov
Igor Melnykov
Semhar Michael
Affiliations
[1] University of Alabama, Department of Information Systems, Statistics, and Management Science
[2] Colorado State University-Pueblo, Department of Mathematics and Physics
Keywords
Semi-supervised clustering; Model-based clustering; Finite mixture models; Positive and negative constraints; BIC; 62H30
DOI
Not available
Abstract
Cluster analysis is a popular technique in statistics and computer science whose objective is to group similar observations into relatively distinct groups, generally known as clusters. Semi-supervised clustering assumes that some additional information about group memberships is available. Under the most frequently considered scenario, labels are known for some portion of the data and unavailable for the remaining observations. In this paper, we discuss a general type of semi-supervised clustering defined by so-called positive and negative constraints. Under positive constraints, some data points are required to belong to the same cluster. By contrast, negative constraints specify that particular points must belong to different clusters. We outline a general framework for semi-supervised clustering with constraints that naturally incorporates the additional information into the EM algorithm traditionally used in mixture modeling and model-based clustering. The developed methodology is illustrated on synthetic and classification datasets. A dendrochronology application is considered and thoroughly discussed.
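To make the idea of positive constraints concrete, the following is a minimal sketch of how such constraints could enter an EM algorithm for a one-dimensional Gaussian mixture. It is an illustration under stated assumptions, not the authors' algorithm: it assumes that all points in a positively constrained block share a single component label, so posterior probabilities are computed per block rather than per point, and it does not handle negative constraints at all. The function name `em_positive_constraints`, the quantile-based initialization, and the toy data are hypothetical choices made for this example.

```python
import numpy as np

def em_positive_constraints(x, blocks, n_components, n_iter=100):
    """EM for a 1-D Gaussian mixture with positive (same-cluster) constraints.

    x: 1-D array of observations.
    blocks: list of index lists; every observation belongs to exactly one
            block, and unconstrained points are singleton blocks.
    Assumption of this sketch: each block carries one shared label.
    """
    x = np.asarray(x, dtype=float)
    block_id = np.empty(x.size, dtype=int)
    for b, idx in enumerate(blocks):
        block_id[idx] = b
    n_blocks = len(blocks)

    # Deterministic initialization (an illustrative choice): spread the
    # means over quantiles of the data, with a common starting variance.
    q = (np.arange(n_components) + 0.5) / n_components
    means = np.quantile(x, q)
    variances = np.full(n_components, x.var())
    weights = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # Log density of every point under every component, shape (n, K).
        log_pdf = -0.5 * (np.log(2 * np.pi * variances)
                          + (x[:, None] - means) ** 2 / variances)

        # E-step: a block's posterior is proportional to the mixing weight
        # times the product of its members' densities (sum of logs).
        block_log = np.zeros((n_blocks, n_components))
        np.add.at(block_log, block_id, log_pdf)
        block_log += np.log(weights)
        block_log -= block_log.max(axis=1, keepdims=True)  # stabilize exp
        post = np.exp(block_log)
        post /= post.sum(axis=1, keepdims=True)

        # M-step: every point inherits the posterior of its block.
        resp = post[block_id]
        nk = np.maximum(resp.sum(axis=0), 1e-12)
        weights = post.sum(axis=0) / n_blocks
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        variances = np.maximum(variances, 1e-6)  # guard against collapse

    labels = post.argmax(axis=1)[block_id]
    return labels, means, variances, weights

# Toy data: two groups; points 0 and 1 must share a cluster, as must 4 and 5.
x = np.array([0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1])
blocks = [[0, 1], [2], [3], [4, 5], [6], [7]]
labels, means, variances, weights = em_positive_constraints(x, blocks, 2)
```

Because labels are assigned per block, constrained points end up in the same cluster by construction. Negative constraints would require restricting the joint assignment of several blocks to distinct components, which this sketch leaves out.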
Pages: 327-349
Page count: 22