A Lagrangian-based score for assessing the quality of pairwise constraints in semi-supervised clustering

Cited by: 3
Authors
Randel, Rodrigo [1 ,2 ]
Aloise, Daniel [1 ,2 ]
Blanchard, Simon J. [3 ]
Hertz, Alain [2 ,4 ]
Affiliations
[1] Polytech Montreal, Dept Genie Informat & Genie Logiciel, Montreal, PQ, Canada
[2] GERAD, Montreal, PQ, Canada
[3] Georgetown Univ, McDonough Sch Business, Washington, DC USA
[4] Polytech Montreal, Dept Math & Genie Ind, Montreal, PQ, Canada
Funding
Swedish Research Council; Natural Sciences and Engineering Research Council of Canada;
Keywords
Clustering; Semi-supervised; Pairwise constraints; Constraint selection; Lagrangian duality; MODEL;
DOI
10.1007/s10618-021-00794-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering algorithms help identify homogeneous subgroups from data. In some cases, additional information about the relationship among some subsets of the data exists. When using a semi-supervised clustering algorithm, an expert may provide additional information to constrain the solution based on that knowledge and, in doing so, guide the algorithm to a more useful and meaningful solution. Such additional information often takes the form of a cannot-link constraint (i.e., two data points cannot be part of the same cluster) or a must-link constraint (i.e., two data points must be part of the same cluster). A key challenge for users of such constraints in semi-supervised learning algorithms, however, is that the addition of inaccurate or conflicting constraints can decrease accuracy and little is known about how to detect whether expert-imposed constraints are likely incorrect. In the present work, we introduce a method to score each must-link and cannot-link pairwise constraint as likely incorrect. Using synthetic experimental examples and real data, we show that the resulting impact score can successfully identify individual constraints that should be removed or revised.
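The two constraint types defined in the abstract can be illustrated with a minimal Python sketch. It only checks which constraints a given cluster assignment violates; it is not the Lagrangian-based impact score proposed in the paper. The function name `violated_constraints` and the toy data are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]


def violated_constraints(
    labels: Sequence[int],
    must_link: List[Pair],
    cannot_link: List[Pair],
) -> Dict[str, List[Pair]]:
    """Return the pairwise constraints that a cluster assignment violates.

    labels[i] is the cluster of data point i (hypothetical toy encoding).
    """
    # A must-link pair is violated when its two points are in different clusters.
    broken_ml = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    # A cannot-link pair is violated when its two points share a cluster.
    broken_cl = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return {"must_link": broken_ml, "cannot_link": broken_cl}


# Toy assignment: points 0 and 1 in cluster 0, points 2 and 3 in cluster 1.
labels = [0, 0, 1, 1]
must_link = [(0, 1), (1, 2)]
cannot_link = [(0, 3), (2, 3)]
result = violated_constraints(labels, must_link, cannot_link)
```

A violated constraint in this sketch is not necessarily an incorrect one; deciding which constraints are likely wrong is the question the paper's score addresses.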
Pages: 2341-2368
Number of pages: 28
Related papers
50 in total
  • [1] A Lagrangian-based score for assessing the quality of pairwise constraints in semi-supervised clustering
    Rodrigo Randel
    Daniel Aloise
    Simon J. Blanchard
    Alain Hertz
    [J]. Data Mining and Knowledge Discovery, 2021, 35 : 2341 - 2368
  • [2] Semi-supervised Clustering with Pairwise and Size Constraints
    Zhang, Shaohong
    Wong, Hau-San
    Xie, Dongqing
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 2450 - 2457
  • [3] Semi-supervised DenPeak Clustering with Pairwise Constraints
    Ren, Yazhou
    Hu, Xiaohui
    Shi, Ke
    Yu, Guoxian
    Yao, Dezhong
    Xu, Zenglin
    [J]. PRICAI 2018: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2018, 11012 : 837 - 850
  • [4] Active semi-supervised spectral clustering based on pairwise constraints
    Wang, Na
    Li, Xia
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2010, 38 (01): : 172 - 176
  • [5] Effective semi-supervised graph clustering with pairwise constraints
    Chen, Jingwei
    Xie, Shiyu
    Yang, Hui
    Nie, Feiping
    [J]. INFORMATION SCIENCES, 2024, 681
  • [6] Semi-Supervised Maximum Margin Clustering with Pairwise Constraints
    Zeng, Hong
    Cheung, Yiu-Ming
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (05) : 926 - 939
  • [7] Deep semi-supervised clustering based on pairwise constraints and sample similarity
    Qin, Xiao
    Yuan, Changan
    Jiang, Jianhui
    Chen, Long
    [J]. PATTERN RECOGNITION LETTERS, 2024, 178 : 1 - 6
  • [8] Research of semi-supervised spectral clustering algorithm based on pairwise constraints
    Ding, Shifei
    Jia, Hongjie
    Zhang, Liwen
    Jin, Fengxiang
    [J]. NEURAL COMPUTING & APPLICATIONS, 2014, 24 (01): : 211 - 219
  • [9] Research of semi-supervised spectral clustering algorithm based on pairwise constraints
    Shifei Ding
    Hongjie Jia
    Liwen Zhang
    Fengxiang Jin
    [J]. Neural Computing and Applications, 2014, 24 : 211 - 219
  • [10] A classification-based approach to semi-supervised clustering with pairwise constraints
    Smieja, Marek
    Struski, Lukasz
    Figueiredo, Mario A. T.
    [J]. NEURAL NETWORKS, 2020, 127 : 193 - 203