Finding multiple stable clusterings

Authors
Juhua Hu
Qi Qian
Jian Pei
Rong Jin
Shenghuo Zhu
Affiliations
[1] Simon Fraser University, School of Computing Science
[2] Alibaba Group
Source
Knowledge and Information Systems, 2017, 51 (03)
Keywords
Multi-clustering; Clustering stability; Laplacian eigengap; Feature subspace
DOI
Not available
Abstract
Multi-clustering, which seeks multiple independent ways to partition a data set into groups, has many applications, such as customer relationship management, bioinformatics and healthcare informatics. This paper addresses two fundamental questions in multi-clustering: how to model the quality of clusterings, and how to find multiple stable clusterings (MSC). We introduce to multi-clustering the notion of clustering stability based on the Laplacian eigengap, which was originally used by the regularized spectral learning method for similarity matrix learning. We prove mathematically that the larger the eigengap, the more stable the clustering. Furthermore, we propose a novel multi-clustering method, MSC. One advantage of our method compared with state-of-the-art multi-clustering methods is that it provides users with a feature subspace for understanding each clustering solution. Another advantage is that MSC does not require users to specify the number of clusters or the number of alternative clusterings, which is usually difficult to do without guidance. Our method can heuristically estimate the number of stable clusterings in a data set. We also discuss a practical way to make MSC applicable to large-scale data. We report an extensive empirical study that demonstrates the effectiveness of our method.
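To make the eigengap idea in the abstract concrete, the following is a minimal sketch of how a Laplacian eigengap can be computed for a single candidate clustering with k clusters. It is not the authors' MSC method: the Gaussian similarity, the bandwidth `sigma`, the symmetric normalized Laplacian, and the function name `laplacian_eigengap` are all assumptions chosen for illustration.

```python
import numpy as np

def laplacian_eigengap(X, k, sigma=1.0):
    """Gap between the k-th and (k+1)-th smallest eigenvalues of the
    symmetric normalized graph Laplacian (illustrative assumption)."""
    # Pairwise squared Euclidean distances between all rows of X.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian similarity; sigma is a hypothetical bandwidth parameter.
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    degrees = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degrees)
    # L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)  # returned in ascending order
    return eigvals[k] - eigvals[k - 1]

rng = np.random.default_rng(0)
# Two well-separated groups: a 2-way partition should look stable.
separated = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(30, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(30, 2)),
])
# One undifferentiated group: a 2-way partition should look unstable.
mixed = rng.normal(loc=0.0, scale=0.3, size=(60, 2))

gap_separated = laplacian_eigengap(separated, k=2)
gap_mixed = laplacian_eigengap(mixed, k=2)
print(gap_separated, gap_mixed)
```

Under these assumptions, the data with two well-separated groups should yield a noticeably larger gap at k = 2 than the single undifferentiated group, which is the sense in which a larger eigengap indicates a more stable clustering.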
Pages: 991-1021
Number of pages: 30
Related papers
  • [1] Finding multiple stable clusterings
    Hu, Juhua
    Qian, Qi
    Pei, Jian
    Jin, Rong
    Zhu, Shenghuo
    KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 51 (03) : 991 - 1021
  • [2] Finding Multiple Stable Clusterings
    Hu, Juhua
    Qian, Qi
    Pei, Jian
    Jin, Rong
    Zhu, Shenghuo
    2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2015, : 171 - 180
  • [3] Finding Alternative Clusterings Using Constraints
    Davidson, Ian
    Qi, Zijie
    ICDM 2008: EIGHTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2008, : 773 - 778
  • [4] On finding graph clusterings with maximum modularity
    Brandes, Ulrik
    Delling, Daniel
    Gaertler, Marco
    Goerke, Robert
    Hoefer, Martin
    Nikoloski, Zoran
    Wagner, Dorothea
    GRAPH-THEORETIC CONCEPTS IN COMPUTER SCIENCE, 2007, 4769 : 121 - +
  • [5] A Principled and Flexible Framework for Finding Alternative Clusterings
    Qi, ZiJie
    Davidson, Ian
    KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2009, : 717 - 725
  • [6] Stable Clusterings and the Cones of Outer Normals
    Happach, Felix
    OPERATIONS RESEARCH PROCEEDINGS 2017, 2018, : 37 - 43
  • [7] Learning Multiple Nonredundant Clusterings
    Cui, Ying
    Fern, Xiaoli Z.
    Dy, Jennifer G.
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2010, 4 (03)
  • [8] Multiple Independent Subspace Clusterings
    Wang, Xing
    Wang, Jun
    Domeniconi, Carlotta
    Yu, Guoxian
    Xiao, Guoqiang
    Guo, Maozu
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 5353 - 5360
  • [9] Multiple Co-Clusterings
    Wang, Xing
    Yu, Guoxian
    Domeniconi, Carlotta
    Wang, Jun
    Yu, Zhiwen
    Zhang, Zili
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 1308 - 1313
  • [10] Combining multiple weak clusterings
    Topchy, A
    Jain, AK
    Punch, W
    THIRD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2003, : 331 - 338