Semi-Supervised Clustering Ensemble Based on Cluster Consensus Selection

Authors
Liu, Yanxi [1, 3]
Al-Khafaji, Ali Hussein Demin [2]
Affiliations
[1] Anshan Normal Univ, Informat Ctr, Anshan, Liaoning, Peoples R China
[2] Al Mustaqbal Univ Coll, Dept Labs, Tech, Babylon, Hillah, Iraq
[3] Anshan Normal Univ, Informat Ctr, Anshan 114007, Liaoning, Peoples R China
Keywords
Consensus selection; ensemble clustering; NMI; semi-supervised clustering
DOI
10.1080/01969722.2022.2159150
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Ensemble clustering emerged as an important extension of the classical clustering problem and is one of the most recent advances in unsupervised learning. Its purpose is to combine the results of different clustering algorithms through a consensus function so that the final solution is better than those of the individual algorithms. In this study, we propose a semi-supervised clustering ensemble framework based on cluster consensus selection, which aims to improve the accuracy of clustering results. In general, there are two types of semi-supervised clustering algorithms: constraint-based and metric-based. Here, the proposed ensemble clustering algorithm is equipped with a semi-supervised clustering mechanism based on pairwise constraints. Since the complexity of consensus functions scales with the number of clustering methods, ensemble clustering of big data is sometimes slow or infeasible. Usually, all primary clusters from all clustering methods are used in the consensus function. However, the merit of clusters from different methods can be taken into account to improve consensus quality. Accordingly, we propose a cluster consensus selection approach that selects a subset of high-merit primary clusters to participate in the final consensus. Here, a measure based on Normalized Mutual Information (NMI) is developed to assess the merit of clusters. Meanwhile, reducing the number of primary clusters in the consensus function makes big data clustering feasible. The proposed algorithm is computationally efficient, with linear complexity in clustering. Experimental results show the effectiveness of the proposed algorithm in terms of performance metrics such as NMI, ARI and CPCC.
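The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give their merit definition, so the sketch assumes a common stand-in in which each base partition is scored by its average NMI against the other partitions and the top-k are kept. The function names `nmi` and `select_partitions`, the square-root normalization, and the whole-partition (rather than per-cluster) granularity are all assumptions made for illustration.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """NMI of two labelings, normalized by sqrt(H(a) * H(b)) (an assumed choice)."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    # Mutual information from the joint and marginal label counts.
    mi = sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        # Degenerate single-cluster labelings: identical only if both are trivial.
        return 1.0 if h_a == h_b else 0.0
    return mi / sqrt(h_a * h_b)


def select_partitions(partitions, k):
    """Return sorted indices of the k partitions with the highest average NMI
    to the other partitions (hypothetical merit score, not the paper's)."""
    m = len(partitions)
    merit = [
        sum(nmi(partitions[i], partitions[j]) for j in range(m) if j != i) / (m - 1)
        for i in range(m)
    ]
    ranked = sorted(range(m), key=lambda i: merit[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    base = [
        [0, 0, 1, 1],  # partition 0
        [1, 1, 0, 0],  # partition 1: same grouping, labels swapped
        [0, 1, 0, 1],  # partition 2: disagrees with the other two
    ]
    print(select_partitions(base, k=2))
```

In this toy ensemble the first two partitions describe the same grouping under different label names, so they reinforce each other and are retained, while the third is dropped before any consensus function would run. A semi-supervised variant could additionally score each partition by how many must-link and cannot-link pairs it satisfies; that step is omitted here because the abstract does not specify how the constraints enter the merit measure.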
Pages: 29
Related papers
50 in total
  • [1] Semi-supervised consensus clustering based on closed patterns
    Yang, Tianshu
    Pasquier, Nicolas
    Precioso, Frederic
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 235
  • [2] Semi-supervised cluster ensemble based on density peaks
    Mustafa, Kadhim
    Wang, Hongjun
    Zhou, Yuan
    Song, Jian
    [J]. DATA SCIENCE AND KNOWLEDGE ENGINEERING FOR SENSING DECISION SUPPORT, 2018, 11 : 645 - 651
  • [3] Semi-Supervised Classification with Cluster Ensemble
    Berikov, Vladimir
    Karaev, Nikita
    Tewari, Ankit
    [J]. 2017 INTERNATIONAL MULTI-CONFERENCE ON ENGINEERING, COMPUTER AND INFORMATION SCIENCES (SIBIRCON), 2017, : 245 - 250
  • [4] Double Selection Based Semi-Supervised Clustering Ensemble for Tumor Clustering from Gene Expression Profiles
    Yu, Zhiwen
    Chen, Hongsheng
    You, Jane
    Wong, Hau-San
    Liu, Jiming
    Li, Le
    Han, Guoqiang
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2014, 11 (04) : 727 - 740
  • [5] Semi-supervised clustering ensemble based on genetic algorithm model
    Bi, Sheng
    Li, Xiangli
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (18) : 55851 - 55865
  • [6] Semi-supervised Selective Clustering Ensemble based on constraint information
    Ma, Tinghuai
    Zhang, Zheng
    Guo, Lei
    Wang, Xin
    Qian, Yurong
    Al-Nabhan, Najla
    [J]. NEUROCOMPUTING, 2021, 462 : 412 - 425
  • [7] Semi-Supervised Ensemble Clustering Based on Selected Constraint Projection
    Yu, Zhiwen
    Luo, Peinan
    Liu, Jiming
    Wong, Hau-San
    You, Jane
    Han, Guoqiang
    Zhang, Jun
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2018, 30 (12) : 2394 - 2407
  • [8] Semi-supervised clustering ensemble based on genetic algorithm model
    Bi, Sheng
    Li, Xiangli
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 : 55851 - 55865
  • [9] Convergence Analysis of Semi-supervised Clustering Ensemble
    Chen, Dahai
    Yang, Yan
    Wang, Hongjun
    Mahmood, Amjad
    [J]. 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2013, : 783 - 788
  • [10] Adaptive Regularized Semi-Supervised Clustering Ensemble
    Luo, Rui
    Yu, Zhiwen
    Cao, Wenming
    Liu, Cheng
    Wong, Hau-San
    Chen, C. L. Philip
    [J]. IEEE ACCESS, 2020, 8 : 17926 - 17934