Learning Multiple Nonredundant Clusterings

Cited by: 10
Authors
Cui, Ying [1]
Fern, Xiaoli Z. [2]
Dy, Jennifer G. [3]
Affiliations
[1] Yahoo Inc, Yahoo Labs, Sunnyvale, CA USA
[2] Oregon State Univ, Sch Elect Engn & Comp Sci, Corvallis, OR 97331 USA
[3] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
Keywords
Nonredundant clustering; disparate clustering; diverse clustering; orthogonalization; feature selection
DOI
10.1145/1839490.1839496
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Real-world applications often involve complex data that can be interpreted in many different ways. When clustering such data, there may exist multiple groupings that are reasonable and interesting from different perspectives. This is especially true for high-dimensional data, where different feature subspaces may reveal different structures of the data. However, traditional clustering is restricted to finding a single clustering of the data. In this article, we propose a new clustering paradigm for exploratory data analysis: find all nonredundant clustering solutions of the data, where data points in the same cluster in one solution can belong to different clusters in other partitioning solutions. We present a framework to solve this problem and suggest two approaches within this framework: (1) orthogonal clustering, and (2) clustering in orthogonal subspaces. In essence, both approaches find alternative ways to partition the data by projecting it to a space that is orthogonal to the current solution. The first approach seeks orthogonality in the cluster space, while the second approach seeks orthogonality in the feature space. We study the relationship between the two approaches. We also combine our framework with techniques for automatically finding the number of clusters in the different solutions, and study stopping criteria for determining when all meaningful solutions have been discovered. We test our framework on both synthetic and high-dimensional benchmark data sets, and the results show that our approaches were able to discover varied clustering solutions that are interesting and meaningful.
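The idea of "projecting the data to a space that is orthogonal to the current solution" can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain k-means as the base clusterer and assumes the "current solution" is represented by the span of its cluster centroids. The names `kmeans` and `project_out` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed base clusterer for this sketch)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    # Final assignment against the last centroid update.
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), centers

def project_out(X, centers):
    """Project X onto the orthogonal complement of span(centers).

    Assumption: the current clustering is summarised by its centroids.
    """
    # Orthonormal basis for the subspace spanned by the centroids.
    U, s, _ = np.linalg.svd(centers.T, full_matrices=False)
    basis = U[:, s > 1e-10 * s.max()]
    # Remove the component of each point that lies in that subspace.
    return X - X @ basis @ basis.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 6))
    labels_1, centers_1 = kmeans(X, k=2)
    X_residual = project_out(X, centers_1)
    # Re-cluster what the first solution does not explain.
    labels_2, _ = kmeans(X_residual, k=2)
    print(labels_1[:10], labels_2[:10])
```

Repeating the cluster-then-project step yields a sequence of clusterings, each computed on data from which the directions used by the previous solution have been removed. A stopping rule, which the abstract mentions but this sketch omits, would decide when the residual no longer contains meaningful structure.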
Pages: 32