Semi-Supervised Clustering Ensemble Based on Cluster Consensus Selection

Cited by: 0
Authors
Liu, Yanxi [1, 3]
Al-Khafaji, Ali Hussein Demin [2 ]
Affiliations
[1] Anshan Normal Univ, Informat Ctr, Anshan, Liaoning, Peoples R China
[2] Al Mustaqbal Univ Coll, Dept Labs, Tech, Babylon, Hillah, Iraq
[3] Anshan Normal Univ, Informat Ctr, Anshan 114007, Liaoning, Peoples R China
Keywords
Consensus selection; ensemble clustering; NMI; semi-supervised clustering
DOI
10.1080/01969722.2022.2159150
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Ensemble clustering has emerged as an important extension of classical clustering and is one of the more recent advances in unsupervised learning. Its purpose is to combine, through a consensus function, the results obtained by different algorithms so that the final solution is better than those of the individual clustering algorithms. In this study, we propose a semi-supervised clustering ensemble framework based on cluster consensus selection, which aims to improve the accuracy of clustering results. In general, semi-supervised clustering algorithms are of two types: constraint-based and metric-based. Here, the proposed ensemble clustering algorithm is equipped with a semi-supervised clustering mechanism based on pairwise constraints. Since the complexity of consensus functions scales with the number of clustering methods, ensemble clustering of big data is sometimes slow or infeasible. Usually, all primary clusters from all clustering methods are used in the consensus function. However, the merit of clusters from different methods can be taken into account to improve consensus quality. Accordingly, we propose a cluster consensus selection approach that selects a subset of high-merit primary clusters to participate in the final consensus. Here, a measure based on Normalized Mutual Information (NMI) is developed to assess the merit of clusters. Reducing the number of primary clusters in the consensus function also makes big data clustering feasible. The proposed algorithm is computationally efficient, with linear complexity in clustering. Experimental results show the effectiveness of the proposed algorithm in terms of performance metrics such as NMI, ARI, and CPCC.
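To make the selection idea in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a simplified setting in which whole base partitions (rather than individual clusters) are ranked by their mean NMI with the other partitions, and the top k are kept for the consensus step. The function names `nmi` and `select_partitions`, the arithmetic-mean normalization, and the mean-NMI merit score are all illustrative assumptions; the paper's actual merit measure and its use of pairwise constraints are not reproduced here.

```python
import math
from collections import Counter


def nmi(a, b):
    """NMI between two labelings of the same points.

    Assumption: arithmetic-mean normalization, 2*I(A;B) / (H(A) + H(B)).
    """
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    # Mutual information from the joint and marginal label counts.
    mi = sum(
        c / n * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    if h_a == 0 and h_b == 0:
        return 1.0  # both labelings put every point in a single cluster
    return 2 * mi / (h_a + h_b)


def select_partitions(partitions, k):
    """Hypothetical merit score: mean NMI of each partition with all others.

    Returns the indices of the k highest-scoring partitions and all scores.
    """
    scores = []
    for i, p in enumerate(partitions):
        others = [nmi(p, q) for j, q in enumerate(partitions) if j != i]
        scores.append(sum(others) / len(others))
    ranked = sorted(range(len(partitions)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k]), scores


# Three base partitions of four points; the first two agree up to relabeling.
base = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
kept, merit = select_partitions(base, k=2)
```

In this toy case the two mutually consistent partitions are kept and the third, which shares no information with them, is dropped. Fewer inputs to the consensus function is the source of the cost reduction the abstract describes.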
Pages: 29
Related papers
50 in total
  • [41] MVS-based Semi-Supervised Clustering
    Yan, Yang
    Chen, Lihui
    Chan, Chee Keong
    [J]. 2013 9TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATIONS AND SIGNAL PROCESSING (ICICS), 2013
  • [42] Semi-supervised hierarchical ensemble clustering based on an innovative distance metric and constraint information
    Shen, Baohua
    Jiang, Juan
    Qian, Feng
    Li, Daoguo
    Ye, Yanming
    Ahmadi, Gholamreza
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 124
  • [43] Semi-Supervised Density-Based Clustering
    Lelis, Levi
    Sander, Joerg
    [J]. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 842 - 847
  • [44] Semi-supervised Classification Based on Clustering Ensembles
    Chen, Si
    Guo, Gongde
    Chen, Lifei
    [J]. ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PROCEEDINGS, 2009, 5855 : 629 - 638
  • [45] SEMI-SUPERVISED FUZZY CLUSTERING WITH LEARNABLE CLUSTER DEPENDENT KERNELS
    Bchir, Ouiem
    Frigui, Hichem
    Ben Ismail, Mohamed Maher
    [J]. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2013, 22 (03)
  • [46] An efficient semi-supervised graph based clustering
    Viet-Vu Vu
    [J]. INTELLIGENT DATA ANALYSIS, 2018, 22 (02) : 297 - 307
  • [47] Density-based semi-supervised clustering
    Carlos Ruiz
    Myra Spiliopoulou
    Ernestina Menasalvas
    [J]. Data Mining and Knowledge Discovery, 2010, 21 : 345 - 370
  • [48] Density-based semi-supervised clustering
    Ruiz, Carlos
    Spiliopoulou, Myra
    Menasalvas, Ernestina
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2010, 21 (03) : 345 - 370
  • [49] Semi-Supervised Clustering Based on Exemplars Constraints
    Wang, Sailan
    Yang, Zhenzhi
    Yang, Jin
    Wang, Hongjun
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (06) : 1231 - 1241
  • [50] An Adaptive Robust Semi-Supervised Clustering Framework Using Weighted Consensus of Random k-Means Ensemble
    Lai, Yongxuan
    He, Songyao
    Lin, Zhijie
    Yang, Fan
    Zhou, Qifeng
    Zhou, Xiaofang
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (05) : 1877 - 1890