A Lagrangian-based score for assessing the quality of pairwise constraints in semi-supervised clustering

Times cited: 3
Authors
Randel, Rodrigo [1 ,2 ]
Aloise, Daniel [1 ,2 ]
Blanchard, Simon J. [3 ]
Hertz, Alain [2 ,4 ]
Affiliations
[1] Polytech Montreal, Dept Genie Informat & Genie Logiciel, Montreal, PQ, Canada
[2] GERAD, Montreal, PQ, Canada
[3] Georgetown Univ, McDonough Sch Business, Washington, DC USA
[4] Polytech Montreal, Dept Math & Genie Ind, Montreal, PQ, Canada
Funding
Swedish Research Council; Natural Sciences and Engineering Research Council of Canada
Keywords
Clustering; Semi-supervised; Pairwise constraints; Constraint selection; Lagrangian duality; MODEL;
DOI
10.1007/s10618-021-00794-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering algorithms help identify homogeneous subgroups from data. In some cases, additional information about the relationship among some subsets of the data exists. When using a semi-supervised clustering algorithm, an expert may provide additional information to constrain the solution based on that knowledge and, in doing so, guide the algorithm to a more useful and meaningful solution. Such additional information often takes the form of a cannot-link constraint (i.e., two data points cannot be part of the same cluster) or a must-link constraint (i.e., two data points must be part of the same cluster). A key challenge for users of such constraints in semi-supervised learning algorithms, however, is that the addition of inaccurate or conflicting constraints can decrease accuracy and little is known about how to detect whether expert-imposed constraints are likely incorrect. In the present work, we introduce a method to score each must-link and cannot-link pairwise constraint as likely incorrect. Using synthetic experimental examples and real data, we show that the resulting impact score can successfully identify individual constraints that should be removed or revised.
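As a minimal illustration of the two constraint types defined in the abstract, the sketch below checks a given cluster assignment against lists of must-link and cannot-link pairs. It is not the Lagrangian-based impact score introduced in the paper; the function name `violated_constraints` and the toy data are hypothetical and chosen only for this example.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def violated_constraints(
    labels: Sequence[int],
    must_link: Sequence[Pair],
    cannot_link: Sequence[Pair],
) -> List[Tuple[str, int, int]]:
    """Return the pairwise constraints that a cluster assignment violates.

    labels[i] is the cluster index of data point i (hypothetical toy setup).
    """
    violations = []
    # A must-link pair is violated when its two points are in different clusters.
    for i, j in must_link:
        if labels[i] != labels[j]:
            violations.append(("must-link", i, j))
    # A cannot-link pair is violated when its two points share a cluster.
    for i, j in cannot_link:
        if labels[i] == labels[j]:
            violations.append(("cannot-link", i, j))
    return violations


if __name__ == "__main__":
    labels = [0, 0, 1, 1]              # assumed assignment of four points
    must_link = [(0, 1), (1, 2)]       # assumed expert constraints
    cannot_link = [(0, 3), (2, 3)]
    print(violated_constraints(labels, must_link, cannot_link))
```

A check like this only reports which constraints a particular solution breaks; it says nothing about whether a constraint is itself likely incorrect, which is the question the paper's score addresses.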
Pages: 2341-2368
Number of pages: 28
Related papers
50 in total
  • [41] Research of semi-supervised spectral clustering based on constraints expansion
    Shifei Ding
    Bingjuan Qi
    Hongjie Jia
    Hong Zhu
    Liwen Zhang
    [J]. Neural Computing and Applications, 2013, 22 : 405 - 410
  • [42] Research of semi-supervised spectral clustering based on constraints expansion
    Ding, Shifei
    Qi, Bingjuan
    Jia, Hongjie
    Zhu, Hong
    Zhang, Liwen
    [J]. NEURAL COMPUTING & APPLICATIONS, 2013, 22 : S405 - S410
  • [43] Semi-Supervised EEG Clustering With Multiple Constraints
    Dai, Chenglong
    Wu, Jia
    Monaghan, Jessica J. M.
    Li, Guanghui
    Peng, Hao
    Becker, Stefanie I.
    McAlpine, David
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (08) : 8529 - 8544
  • [44] Active Learning of Constraints for Semi-Supervised Clustering
    Xiong, Sicheng
    Azimi, Javad
    Fern, Xiaoli Z.
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (01) : 43 - 54
  • [45] On the effects of constraints in semi-supervised hierarchical clustering
    Kestler, Hans A.
    Kraus, Johann M.
    Palm, Guenther
    Schwenker, Friedhelm
    [J]. ARTIFICIAL NEURAL NETWORKS IN PATTERN RECOGNITION, PROCEEDINGS, 2006, 4087 : 57 - 66
  • [46] TextCSN: a Semi-Supervised Approach for Text Clustering Using Pairwise Constraints and Convolutional Siamese Network
    Vilhagra, Lucas Akayama
    Fernandes, Eraldo Rezende
    Nogueira, Bruno Magalhaes
    [J]. PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020, : 1135 - 1142
  • [47] Semi-supervised Overlapping Community Finding Based on Label Propagation with Pairwise Constraints
    Alghamdi, Elham
    Greene, Derek
    [J]. COMPLEX NETWORKS AND THEIR APPLICATIONS VII, VOL 1, 2019, 812 : 316 - 327
  • [48] Centroid Neural Network with Pairwise Constraints for Semi-supervised Learning
    Minh Tran Ngoc
    Dong-Chul Park
    [J]. Neural Processing Letters, 2018, 48 : 1721 - 1747
  • [49] Semi-supervised Clustering via Pairwise Constrained Optimal Graph
    Nie, Feiping
    Zhang, Han
    Wang, Rong
    Li, Xuelong
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3160 - 3166
  • [50] Active semi-supervised overlapping community finding with pairwise constraints
    Alghamdi, Elham
    Greene, Derek
    [J]. APPLIED NETWORK SCIENCE, 2019, 4 (01)