Distribution-Based Cluster Structure Selection

Cited by: 44
Authors
Yu, Zhiwen [1, 2]
Zhu, Xianjun [1]
Wong, Hau-San [3]
You, Jane [4]
Zhang, Jun [5]
Han, Guoqiang [1]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[2] Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong 852, Hong Kong, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
[5] Sun Yat Sen Univ, Sch Adv Comp, Guangzhou 510275, Guangdong, Peoples R China
Keywords
Cluster ensemble; clustering analysis; expectation-maximization (EM); Gaussian mixture model (GMM); graph cut; hypergraph; ENSEMBLE FRAMEWORK; CONSENSUS; COMBINATION; SEARCH;
DOI
10.1109/TCYB.2016.2569529
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The objective of a cluster structure ensemble is to find a unified cluster structure from multiple cluster structures obtained from different datasets. Unfortunately, not all of the cluster structures contribute to the unified cluster structure. This paper investigates how to select the suitable cluster structures in the ensemble, which are then summarized into a more representative cluster structure. Specifically, each cluster structure is first represented by a mixture of Gaussian distributions, the parameters of which are estimated using the expectation-maximization (EM) algorithm. Then, several distribution-based distance functions are designed to evaluate the similarity between two cluster structures. Based on the similarity comparison results, we propose a new approach, referred to as the distribution-based cluster structure ensemble (DCSE) framework, to find the most representative unified cluster structure. We then design a new technique, the distribution-based cluster structure selection strategy (DCSSS), to select a subset of cluster structures. Finally, we propose using a distribution-based normalized hypergraph cut algorithm to generate the final result. In our experiments, a nonparametric test is adopted to evaluate the difference between DCSE and its competitors. We use 20 real-world datasets obtained from the University of California, Irvine (UCI) and Knowledge Extraction based on Evolutionary Learning (KEEL) repositories, together with a number of cancer gene expression profiles, to evaluate the performance of the proposed methods. The experimental results show that: 1) DCSE works well on the real-world datasets and 2) DCSE based on DCSSS can further improve the performance of the algorithm.
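The selection idea described in the abstract can be illustrated with a small sketch. It is not the authors' DCSSS: the abstract does not give the distance functions or the selection rule, so everything below is an assumption made for illustration. Each cluster is summarized by a single 1-D Gaussian fitted from hard assignments (standing in for a GMM estimated by EM), the Bhattacharyya distance serves as one possible distribution-based distance, and the hypothetical helper `select_structures` keeps the structures that are, on average, closest to the rest of the ensemble.

```python
import math
from statistics import fmean, pvariance


def fit_gaussian(points):
    """Summarize a 1-D cluster as (mean, variance).

    Assumption: moment estimates from hard assignments replace EM here.
    The variance is floored so later logarithms stay finite.
    """
    return fmean(points), max(pvariance(points), 1e-9)


def bhattacharyya(g1, g2):
    """Bhattacharyya distance between two univariate Gaussians."""
    (m1, v1), (m2, v2) = g1, g2
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def structure_distance(s1, s2):
    """Symmetric average of nearest-component distances (illustrative choice)."""
    d12 = fmean(min(bhattacharyya(a, b) for b in s2) for a in s1)
    d21 = fmean(min(bhattacharyya(b, a) for a in s1) for b in s2)
    return 0.5 * (d12 + d21)


def select_structures(structures, keep):
    """Return indices of the `keep` structures closest to the others on average."""
    def mean_dist(i):
        return fmean(
            structure_distance(structures[i], s)
            for j, s in enumerate(structures)
            if j != i
        )

    ranked = sorted(range(len(structures)), key=mean_dist)
    return sorted(ranked[:keep])


# Three toy cluster structures; the third disagrees with the first two.
raw = [
    [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]],
    [[0.1, 1.1, 2.1], [10.1, 11.1, 12.1]],
    [[50.0, 51.0, 52.0], [90.0, 91.0, 92.0]],
]
structures = [[fit_gaussian(c) for c in clusters] for clusters in raw]
selected = select_structures(structures, keep=2)
print(selected)
```

On this toy input the two mutually consistent structures are retained and the outlying one is dropped. A fuller treatment would fit multivariate mixtures with EM and pass the selected structures to a consensus step such as the hypergraph cut named in the abstract.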
Pages: 3554-3567
Page count: 14
Related papers
50 in total
  • [41] Towards distribution-based calibration for traffic simulation
    Antoniou, Constantinos
    Gikas, Vassilis
    Papathanasopoulou, Vasileia
    Mpimis, Thanassis
    Markou, Ioulia
    Perakis, Haris
    2014 IEEE 17TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2014, : 786 - 791
  • [42] Error Analysis on Distribution-based Frequency Estimator
    Ishizaka, Asami
    Nitta, Masuhiro
    Kato, Kiyotaka
    INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2010), 2010, : 255 - 260
  • [43] Distribution-based anomaly detection in network traffic
    Coluccia, Angelo
    D'Alconzo, Alessandro
    Ricciato, Fabio
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013, 7754 : 202 - 216
  • [44] Chain code distribution-based image retrieval
    Sun Junding
    Wu Xiaosheng
    IIH-MSP: 2006 INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING, PROCEEDINGS, 2006, : 139 - +
  • [45] Destination Prediction by Trajectory Distribution-Based Model
    Besse, Philippe C.
    Guillouet, Brendan
    Loubes, Jean-Michel
    Royer, Francois
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2018, 19 (08) : 2470 - 2481
  • [46] Distribution-Based Iron Road Structure Recognition Method Using Automotive Radar Sensor
    Lee, Seongwook
    Kim, Seong-Cheol
    2018 IEEE RADAR CONFERENCE (RADARCONF18), 2018, : 212 - 217
  • [47] Distribution-based PV module degradation model
    Lai, Guangzhi
    Wang, Dong
    Wang, Ziyue
    Fan, Fu
    Wang, Qihang
    Wang, Ruyi
    ENERGY SCIENCE & ENGINEERING, 2023, 11 (03) : 1219 - 1228
  • [48] Distribution-Based Calibration of a Stormwater Quality Model
    Leutnant, Dominik
    Muschalla, Dirk
    Uhl, Mathias
    WATER, 2018, 10 (08)
  • [49] Optimizing distribution-based matching by random subsampling
    Leung, Alex Po
    Gong, Shaogang
    2007 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-8, 2007, : 1627 - +
  • [50] Linear distribution-based retrieval of underground voids
    Liseno, A
    Colella, N
    Pierri, R
    Soldovieri, F
    IGARSS 2003: IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS I - VII, PROCEEDINGS: LEARNING FROM EARTH'S SHAPES AND SIZES, 2003, : 3848 - 3850