Distribution-Based Cluster Structure Selection

Cited by: 44
Authors
Yu, Zhiwen [1, 2]
Zhu, Xianjun [1]
Wong, Hau-San [3]
You, Jane [4]
Zhang, Jun [5]
Han, Guoqiang [1]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[2] Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong 852, Hong Kong, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[5] Sun Yat Sen Univ, Sch Adv Comp, Guangzhou 510275, Guangdong, Peoples R China
Keywords
Cluster ensemble; clustering analysis; expectation-maximization (EM); Gaussian mixture model (GMM); graph cut; hypergraph; ensemble framework; consensus; combination; search
DOI
10.1109/TCYB.2016.2569529
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The objective of a cluster structure ensemble is to find a unified cluster structure from multiple cluster structures obtained from different datasets. Unfortunately, not all of the cluster structures contribute to the unified cluster structure. This paper investigates how to select suitable cluster structures in the ensemble, which are then summarized into a more representative cluster structure. Specifically, each cluster structure is first represented by a mixture of Gaussian distributions, the parameters of which are estimated using the expectation-maximization (EM) algorithm. Then, several distribution-based distance functions are designed to evaluate the similarity between two cluster structures. Based on the similarity comparison results, we propose a new approach, referred to as the distribution-based cluster structure ensemble (DCSE) framework, to find the most representative unified cluster structure. We then design a new technique, the distribution-based cluster structure selection strategy (DCSSS), to select a subset of cluster structures. Finally, we propose using a distribution-based normalized hypergraph cut algorithm to generate the final result. In our experiments, a nonparametric test is adopted to evaluate the difference between DCSE and its competitors. We adopt 20 real-world datasets obtained from the University of California, Irvine (UCI) and Knowledge Extraction based on Evolutionary Learning (KEEL) repositories, together with a number of cancer gene expression profiles, to evaluate the performance of the proposed methods. The experimental results show that: 1) DCSE works well on the real-world datasets and 2) DCSE based on DCSSS can further improve the performance of the algorithm.
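The pipeline described in the abstract (fit a Gaussian mixture with EM, compare cluster structures with a distribution-based distance, then select a subset) can be illustrated with a minimal one-dimensional sketch. The abstract does not specify the distance functions or the selection rule, so everything below is an assumption made for illustration: the symmetric Kullback-Leibler divergence between components, the weighted nearest-component distance in `structure_distance`, and the "closest on average" rule in `select_structures` are hypothetical stand-ins, not the DCSE or DCSSS definitions from the paper.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=60):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns a list of (weight, mean, variance) tuples sorted by mean.
    Initialization from evenly spaced quantiles is an illustrative choice.
    """
    n = len(xs)
    ordered = sorted(xs)
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid collapse
    return sorted(zip(weights, means, variances), key=lambda c: c[1])


def sym_kl(c1, c2):
    """Symmetric KL divergence between two univariate Gaussian components."""
    _, m1, v1 = c1
    _, m2, v2 = c2
    d2 = (m1 - m2) ** 2
    return 0.5 * ((v1 + d2) / v2 + (v2 + d2) / v1 - 2.0)


def structure_distance(a, b):
    """Assumed distance: weighted nearest-component divergence, symmetrized."""
    def directed(p, q):
        return sum(c[0] * min(sym_kl(c, d) for d in q) for c in p)
    return 0.5 * (directed(a, b) + directed(b, a))


def select_structures(structures, keep):
    """Hypothetical rule: keep the structures closest on average to the rest."""
    scores = []
    for i, s in enumerate(structures):
        others = [structure_distance(s, t) for j, t in enumerate(structures) if j != i]
        scores.append((sum(others) / len(others), i))
    return sorted(i for _, i in sorted(scores)[:keep])


def sample(seed, centers, n=100):
    """Draw n unit-variance Gaussian points around each center (synthetic data)."""
    rng = random.Random(seed)
    return [rng.gauss(c, 1.0) for c in centers for _ in range(n)]


a = fit_gmm_1d(sample(0, [0.0, 5.0]))
b = fit_gmm_1d(sample(1, [0.0, 5.0]))
c = fit_gmm_1d(sample(2, [0.0, 20.0]))
print(structure_distance(a, b), structure_distance(a, c))
print(select_structures([a, b, c], keep=2))
```

In this toy setup, structures `a` and `b` come from similarly placed clusters while `c` does not, so a distance-based selection would be expected to retain the first two. The paper itself works with multivariate data and a normalized hypergraph cut for the final consensus, neither of which is covered here.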
Pages: 3554-3567
Number of pages: 14