A comprehensive study of clustering ensemble weighting based on cluster quality and diversity

Cited by: 60
Authors
Nazari, Ahmad [1]
Dehghan, Ayob [1]
Nejatian, Samad [2, 3]
Rezaie, Vahideh [3, 4]
Parvin, Hamid [1, 5]
Affiliations
[1] Islamic Azad Univ, Yasooj Branch, Dept Comp Engn, Yasuj, Iran
[2] Islamic Azad Univ, Yasooj Branch, Dept Elect Engn, Yasuj, Iran
[3] Islamic Azad Univ, Yasooj Branch, Young Researchers & Elite Club, Yasuj, Iran
[4] Islamic Azad Univ, Yasooj Branch, Dept Math, Yasuj, Iran
[5] Islamic Azad Univ, Young Researchers & Elite Club, Nourabad Mamasani Branch, Nourabad, Mamasani, Iran
Keywords
Data clustering; Clustering ensemble; Consensus function; Weighting; COMBINING MULTIPLE CLUSTERINGS; TRANSFER DISTANCE; SELECTION; CONSENSUS; PARTITIONS;
DOI
10.1007/s10044-017-0676-x
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Clustering, a major task in data mining, aims to discover hidden patterns in unlabeled datasets, and finding the best clustering is considered one of the most challenging problems in the field. Because of the complexity of the problem and the weaknesses of primary clustering algorithms, much research has been directed toward ensemble clustering methods. Ensemble clustering aggregates a pool of base clusterings and produces an output clustering, also called the consensus clustering, which is usually better than the outputs of the base clustering algorithms. However, low-quality base clusterings weaken the consensus clustering. Although some research has addressed selecting a subset of high-quality base clusterings according to a clustering assessment metric, selection at the cluster level has been ignored. This paper proposes a new clustering ensemble framework based on cluster-level weighting. The reliability of a cluster is defined as the certainty that the given ensemble has about that cluster, which is computed from the accretion amount of that cluster by the ensemble. The final ensemble is then created by selecting the best clusters and assigning each selected cluster a weight based on its reliability. The paper next proposes a cluster-level weighting co-association matrix in place of the traditional co-association matrix, and introduces two consensus functions that are used to produce the consensus partition. In the reported experiments, the proposed framework outperforms state-of-the-art clustering ensemble methods.
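The idea of a cluster-level weighted co-association matrix can be illustrated with a minimal Python sketch. The abstract does not give the paper's reliability formula or its selection rule, so this is not the authors' method: the function name weighted_co_association, the weights dictionary, and the convention that a missing weight means "cluster not selected" are all assumptions made for illustration. Each pair of points accumulates the weight of every base cluster that contains both of them, and the sum is divided by the number of base clusterings.

```python
from typing import Dict, List, Sequence, Tuple


def weighted_co_association(
    clusterings: Sequence[Sequence[int]],
    weights: Dict[Tuple[int, int], float],
) -> List[List[float]]:
    """Build a cluster-level weighted co-association matrix (illustrative sketch).

    clusterings[m][i] is the cluster label of point i in base clustering m.
    weights[(m, label)] is a caller-supplied weight for that cluster; how the
    paper derives it from reliability is not stated in the abstract, so it is
    treated here as an input. A missing key counts as weight 0 (assumption:
    the cluster was not selected).
    """
    n = len(clusterings[0])
    total = [[0.0] * n for _ in range(n)]
    for m, labels in enumerate(clusterings):
        if len(labels) != n:
            raise ValueError("all base clusterings must label the same points")
        for i in range(n):
            w = weights.get((m, labels[i]), 0.0)
            if w == 0.0:
                continue  # unselected cluster contributes nothing
            for j in range(n):
                if labels[i] == labels[j]:
                    total[i][j] += w  # both points share this weighted cluster
    k = len(clusterings)
    return [[value / k for value in row] for row in total]


# Two base clusterings of three points.
base = [[0, 0, 1], [0, 1, 1]]

# With every cluster weighted 1.0 the result is the traditional matrix.
uniform = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0}
plain = weighted_co_association(base, uniform)

# Hypothetical weights: one cluster is down-weighted, one is dropped.
chosen = {(0, 0): 1.0, (0, 1): 0.5, (1, 1): 1.0}
weighted = weighted_co_association(base, chosen)
```

With uniform weights the sketch reduces to the traditional co-association matrix, which makes the effect of the weights easy to compare. A consensus partition could then be obtained by running any similarity-based clustering on the resulting matrix; the two consensus functions the paper introduces are not described in the abstract and are not reproduced here.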
Pages: 133-145
Number of pages: 13
Related papers
50 in total
  • [11] Ensemble Clustering with Novel Weighting Strategy
    Sun, Yao
    Jia, Hong
    Huang, Jiwu
    2018 14TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS), 2018, : 155 - 159
  • [12] A Clustering Ensemble Method Based on Cluster Selection and Cluster Splitting
    Tang, Yuyang
    Liu, Xiabi
    PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018), 2018, : 54 - 58
  • [13] Ensemble based on Accuracy and Diversity Weighting for Evolving Data Streams
    Sun, Yange
    Shao, Han
    Zhang, Bencai
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2022, 19 (01) : 90 - 96
  • [14] Data and cluster weighting in target selection based on fuzzy clustering
    Kaymak, U
    FUZZY SETS AND SYSTEMS - IFSA 2003, PROCEEDINGS, 2003, 2715 : 568 - 575
  • [15] Credit Scoring Using Ensemble Classification Based on Variable Weighting Clustering
    Ding, Haiyang
    Zhang, Peng
    Lu, Tun
    Gu, Hansu
    Gu, Ning
    2017 IEEE 21ST INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2017, : 509 - 514
  • [16] Categorical Data Clustering Based on Cluster Ensemble Process
    Veeraiah, D.
    Vasumathi, D.
    PROCEEDINGS OF THE INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, ICICT 2015, VOL 2, 2016, 439 : 101 - 111
  • [17] A Novel Cluster Ensemble based on a Single Clustering Algorithm
    Khan, Tahseen
    Tian, Wenhong
    Kadhim, Mustafa R.
    Buyya, Rajkumar
    PROCEEDINGS OF THE 2021 16TH CONFERENCE ON COMPUTER SCIENCE AND INTELLIGENCE SYSTEMS (FEDCSIS), 2021, : 127 - 135
  • [18] Tumor Clustering based on Hybrid Cluster Ensemble Framework
    Yu, Zhiwen
    You, Jane
    Chen, Hantao
    Li, Le
    Wang, Xiaowei
    2012 INTERNATIONAL CONFERENCE ON COMPUTERIZED HEALTHCARE (ICCH), 2012, : 99 - +
  • [19] A clustering ensemble algorithm based on cluster-mode
    Jia, Rui-Yu
    Geng, Jin-Wei
    International Journal of Digital Content Technology and its Applications, 2012, 6 (19) : 17 - 24
  • [20] A Comparative Study of Selective Cluster Ensemble for Document Clustering
    Xu, Sen
    Gao, Jun
    Xu, Xiufang
    Li, Xianfeng
    Yu, Hualong
    2015 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS IHMSC 2015, VOL I, 2015, : 308 - 311