Stratified Feature Sampling for Semi-Supervised Ensemble Clustering

Cited by: 3
Authors
Tian, Jialin [1]
Ren, Yazhou [1]
Cheng, Xiang [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
[2] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24060 USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Constraint propagation; ensemble clustering; high dimensional data; semi-supervised learning; stratified feature sampling
DOI
10.1109/ACCESS.2019.2939581
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Ensemble Clustering (EC), which seeks to generate a consensus clustering by integrating multiple base clusterings, has attracted increasing attention. However, traditional EC methods typically have three main limitations: (1) High dimensional data present a major challenge to ensemble clustering methods. (2) Most EC algorithms cannot use prior information, e.g., pairwise constraints, to enhance clustering performance. (3) Even in existing semi-supervised ensemble clustering methods, prior information is not sufficiently used, e.g., it is used only when generating base clusterings. To alleviate these problems, we propose Stratified Feature Sampling for Semi-Supervised Ensemble Clustering (SFS³EC). First, we develop a novel stratified feature sampling method, which can simultaneously cope with high dimensional data, guarantee the diversity of base clusterings, and reduce the risk that some features are never selected. Second, semi-supervised clustering, i.e., constraint propagation, is applied to obtain the base clusterings. Finally, we propose to utilize prior information in both the base clustering generation process and the consensus process, which ensures that prior information is sufficiently used. We conduct a series of experiments on ten real-world data sets to demonstrate the effectiveness of the proposed model.
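The abstract does not say how the strata are formed, so the following is only a minimal sketch of the general idea of stratified feature sampling, not the authors' procedure. It assumes, purely for illustration, that features are split into contiguous blocks of indices and that each base clustering draws the same number of features from every block. The function names `make_strata` and `stratified_feature_subsets` and all parameter values are hypothetical.

```python
import random


def make_strata(n_features, n_strata):
    """Partition feature indices 0..n_features-1 into contiguous blocks.

    Assumption: contiguous blocks stand in for whatever stratification
    criterion the paper actually uses, which the abstract does not state.
    """
    base, extra = divmod(n_features, n_strata)
    strata, start = [], 0
    for s in range(n_strata):
        size = base + (1 if s < extra else 0)  # spread the remainder evenly
        strata.append(list(range(start, start + size)))
        start += size
    return strata


def stratified_feature_subsets(n_features, n_strata, per_stratum, n_subsets, seed=0):
    """Draw one feature subset per base clustering, sampling from every stratum."""
    rng = random.Random(seed)
    strata = make_strata(n_features, n_strata)
    subsets = []
    for _ in range(n_subsets):
        chosen = []
        for stratum in strata:
            k = min(per_stratum, len(stratum))
            # Sampling without replacement inside each stratum means every
            # stratum is represented in every subset.
            chosen.extend(rng.sample(stratum, k))
        subsets.append(sorted(chosen))
    return subsets


# Hypothetical sizes: 20 features, 4 strata, 2 features per stratum, 5 subsets.
subsets = stratified_feature_subsets(n_features=20, n_strata=4, per_stratum=2, n_subsets=5)
for subset in subsets:
    print(subset)
```

Each printed subset would then be passed to a base clusterer. Because every subset draws from every stratum, no region of the feature space is left out of a base clustering, while the random draws keep the subsets different from one another.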
Pages: 128669-128675
Page count: 7