Constraint selection by committee: an ensemble approach to identifying informative constraints for semi-supervised clustering

Citations: 0
Authors
Greene, Derek [1]
Cunningham, Padraig [1]
Affiliations
[1] Univ Coll Dublin, Dublin 2, Ireland
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A number of clustering algorithms have been proposed for use in tasks where a limited degree of supervision is available. This prior knowledge is frequently provided in the form of pairwise must-link and cannot-link constraints. While the incorporation of pairwise supervision has the potential to improve clustering accuracy, the composition and cardinality of the constraint sets can significantly impact upon the level of improvement. We demonstrate that it is often possible to correctly "guess" a large number of constraints without supervision from the co-associations between pairs of objects in an ensemble of clusterings. Along the same lines, we establish that constraints based on pairs with uncertain co-associations are particularly informative, if known. An evaluation on text data shows that this provides an effective criterion for identifying constraints, leading to a reduction in the level of supervision required to direct a clustering algorithm to an accurate solution.
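The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `co_association` and `split_pairs`, the thresholds `low` and `high`, and the toy ensemble are all assumptions made for illustration. The sketch computes, for each pair of objects, the fraction of ensemble clusterings that place them in the same cluster, then "guesses" must-link and cannot-link constraints from pairs with extreme co-association and ranks the remaining uncertain pairs as candidates for supervision.

```python
import numpy as np


def co_association(labelings):
    """Fraction of ensemble members that put each pair of objects in the same cluster."""
    labelings = np.asarray(labelings)  # shape: (n_clusterings, n_objects)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)  # shape: (n_objects, n_objects)


def split_pairs(coassoc, low=0.1, high=0.9):
    """Split object pairs by co-association (thresholds are illustrative assumptions).

    Returns guessed must-link pairs, guessed cannot-link pairs, and the
    uncertain pairs sorted so that values closest to 0.5 come first.
    """
    rows, cols = np.triu_indices(coassoc.shape[0], k=1)
    must, cannot, uncertain = [], [], []
    for i, j in zip(rows.tolist(), cols.tolist()):
        value = coassoc[i, j]
        if value >= high:
            must.append((i, j))      # the ensemble almost always groups them together
        elif value <= low:
            cannot.append((i, j))    # the ensemble almost always separates them
        else:
            uncertain.append((i, j))
    uncertain.sort(key=lambda pair: abs(coassoc[pair] - 0.5))
    return must, cannot, uncertain


# Toy ensemble: four clusterings of five objects (hypothetical data).
ensemble = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
must, cannot, uncertain = split_pairs(co_association(ensemble))
print(must)          # [(0, 1)]
print(cannot)        # [(0, 3), (1, 3)]
print(uncertain[0])  # (2, 4)
```

In this sketch, the pairs at the head of `uncertain` are the ones an oracle would be asked about first, on the assumption that the ensemble already agrees on the others.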
Pages: 140 / +
Page count: 2
Related papers
50 in total
  • [41] Incremental Semi-Supervised Clustering Ensemble for High Dimensional Data Clustering
    Yu, Zhiwen
    Luo, Peinan
    You, Jane
    Wong, Hau-San
    Leung, Hareton
    Wu, Si
    Zhang, Jun
    Han, Guoqiang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (03) : 701 - 714
  • [42] Incremental Semi-supervised Clustering Ensemble for High Dimensional Data Clustering
    Yu, Zhiwen
    Luo, Peinan
    Wu, Si
    Han, Guoqiang
    You, Jane
    Leung, Hareton
    Wong, Hau-San
    Zhang, Jun
    2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2016, : 1484 - 1485
  • [43] Semi-Supervised Clustering Algorithms Through Active Constraints
    Almazroi, Abdulwahab Ali
    Atwa, Walid
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (07) : 338 - 345
  • [44] A mixed ensemble approach for the semi-supervised problem
    Dimitriadou, E
    Weingessel, A
    Hornik, K
    ARTIFICIAL NEURAL NETWORKS - ICANN 2002, 2002, 2415 : 571 - 576
  • [45] Active Learning of Constraints for Semi-supervised Text Clustering
    Huang, Ruizhang
    Lam, Wai
    Zhang, Zhigang
    PROCEEDINGS OF THE SEVENTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2007, : 113 - 124
  • [46] Effective semi-supervised graph clustering with pairwise constraints
    Chen, Jingwei
    Xie, Shiyu
    Yang, Hui
    Nie, Feiping
    INFORMATION SCIENCES, 2024, 681
  • [47] Semi-Supervised Maximum Margin Clustering with Pairwise Constraints
    Zeng, Hong
    Cheung, Yiu-Ming
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (05) : 926 - 939
  • [48] An Efficient Semi-Supervised Clustering Algorithm with Sequential Constraints
    Yi, Jinfeng
    Zhang, Lijun
    Yang, Tianbao
    Liu, Wei
    Wang, Jun
    KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 1405 - 1414
  • [49] SEMI-SUPERVISED EVALUATION OF CONSTRAINT SCORES FOR FEATURE SELECTION
    Kalakech, Mariam
    Biela, Philippe
    Hamad, Denis
    Macaire, Ludovic
    NCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION THEORY AND APPLICATIONS, 2011, : 175 - 182
  • [50] A new feature selection approach based on ensemble methods in semi-supervised classification
    Nesma Settouti
    Mohamed Amine Chikh
    Vincent Barra
    Pattern Analysis and Applications, 2017, 20 : 673 - 686