Semi-supervised deep embedded clustering with pairwise constraints and subset allocation

Cited by: 4
Authors
Wang, Yalin [1]
Zou, Jiangfeng [1]
Wang, Kai [1]
Liu, Chenliang [1]
Yuan, Xiaofeng [1]
Affiliations
[1] Cent South Univ, Sch Automat, Changsha 410083, Hunan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semi-supervised clustering; Deep embedded clustering; Pairwise constraints; Subset allocation; Sample overlap; REPRESENTATIONS; ALGORITHM;
DOI
10.1016/j.neunet.2023.04.016
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Semi-supervised deep clustering methods attract much attention due to their excellent performance on end-to-end clustering tasks. However, satisfactory clustering results are hard to obtain because the many overlapping samples in industrial text datasets strongly and incorrectly influence the learning process. Existing methods incorporate prior knowledge in the form of either pairwise constraints or class labels. This not only largely ignores the correlation between these two kinds of supervision information but also leads to weakly supervised constraints or incorrect guidance from strongly supervised labels. To tackle these problems, we propose a semi-supervised method based on pairwise constraints and subset allocation (PCSA-DEC). We redefine the similarity-based constraint loss by forcing the similarity between samples of the same class to be much higher than their similarity to other samples, and we design a novel subset allocation loss to precisely learn the strongly supervised information contained in the labels that is consistent with the unlabeled data. Experimental results on two industrial text datasets show that our method yields an improvement of 8.2%-8.7% in accuracy and 13.4%-19.8% in normalized mutual information over the state-of-the-art method. (c) 2023 Elsevier Ltd. All rights reserved.
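The abstract reports its gains in clustering accuracy and normalized mutual information (NMI). The record does not say how the paper computes them, so the following is only a minimal sketch of how these two metrics are commonly defined. The function names are hypothetical, accuracy is taken under the best one-to-one mapping of clusters to classes (found here by brute force, which is only practical for a handful of clusters), and NMI is normalized by the geometric mean of the two entropies, which is one of several conventions in use.

```python
from collections import Counter
from itertools import permutations
from math import log, sqrt


def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of clusters to classes."""
    classes = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    # Pad with None so the search also works when there are more clusters
    # than classes; a cluster mapped to None contributes no hits.
    size = max(len(classes), len(clusters))
    classes += [None] * (size - len(classes))
    pair_counts = Counter(zip(y_pred, y_true))
    best_hits = 0
    # Brute force over injective mappings (assumption: few clusters).
    for mapping in permutations(classes, len(clusters)):
        hits = sum(pair_counts[(c, k)] for c, k in zip(clusters, mapping))
        best_hits = max(best_hits, hits)
    return best_hits / len(y_true)


def normalized_mutual_information(y_true, y_pred):
    """NMI = I(T; P) / sqrt(H(T) * H(P)), using natural logarithms."""
    n = len(y_true)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    joint_counts = Counter(zip(y_true, y_pred))
    mutual_info = sum(
        (n_ij / n) * log(n * n_ij / (true_counts[i] * pred_counts[j]))
        for (i, j), n_ij in joint_counts.items()
    )
    h_true = -sum((c / n) * log(c / n) for c in true_counts.values())
    h_pred = -sum((c / n) * log(c / n) for c in pred_counts.values())
    if h_true == 0 or h_pred == 0:
        # Degenerate case: at least one labeling has a single group.
        return 1.0 if h_true == h_pred else 0.0
    return mutual_info / sqrt(h_true * h_pred)


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    truth = [0, 0, 1, 1, 2, 2]
    predicted = [1, 1, 0, 0, 2, 2]  # same partition, permuted cluster ids
    print(clustering_accuracy(truth, predicted))
    print(normalized_mutual_information(truth, predicted))
```

Both metrics are invariant to how cluster ids are numbered, which is why they suit clustering evaluation, where predicted ids carry no inherent meaning.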
Pages: 310-322
Number of pages: 13
Related papers
50 in total
  • [41] On the effects of constraints in semi-supervised hierarchical clustering
    Kestler, Hans A.
    Kraus, Johann M.
    Palm, Guenther
    Schwenker, Friedhelm
    [J]. ARTIFICIAL NEURAL NETWORKS IN PATTERN RECOGNITION, PROCEEDINGS, 2006, 4087 : 57 - 66
  • [42] Semi-Supervised Clustering Based on Exemplars Constraints
    Wang, Sailan
    Yang, Zhenzhi
    Yang, Jin
    Wang, Hongjun
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (06) : 1231 - 1241
  • [43] Pairwise constraints-based semi-supervised fuzzy clustering with multi-manifold regularization
    Wang, Yingxu
    Chen, Long
    Zhou, Jin
    Li, Tianjun
    Yu, Yufeng
    [J]. INFORMATION SCIENCES, 2023, 638
  • [44] TextCSN: a Semi-Supervised Approach for Text Clustering Using Pairwise Constraints and Convolutional Siamese Network
    Vilhagra, Lucas Akayama
    Fernandes, Eraldo Rezende
    Nogueira, Bruno Magalhaes
    [J]. PROCEEDINGS OF THE 35TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING (SAC'20), 2020, : 1135 - 1142
  • [45] Centroid Neural Network with Pairwise Constraints for Semi-supervised Learning
    Minh Tran Ngoc
    Dong-Chul Park
    [J]. Neural Processing Letters, 2018, 48 : 1721 - 1747
  • [46] Semi-supervised Deep Embedded Clustering with Anomaly Detection for Semantic Frame Induction
    Yong, Zheng-Xin
    Torrent, Tiago Timponi
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3509 - 3519
  • [47] Semi-supervised Clustering via Pairwise Constrained Optimal Graph
    Nie, Feiping
    Zhang, Han
    Wang, Rong
    Li, Xuelong
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3160 - 3166
  • [48] Active semi-supervised overlapping community finding with pairwise constraints
    Alghamdi, Elham
    Greene, Derek
    [J]. APPLIED NETWORK SCIENCE, 2019, 4 (01)