On constructing an optimal consensus clustering from multiple clusterings

Cited by: 7
Authors
Berman, Piotr [1]
DasGupta, Bhaskar
Kao, Ming-Yang
Wang, Jie
Affiliations
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
[2] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[3] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
[4] Univ Massachusetts, Dept Comp Sci, Lowell, MA 01854 USA
Funding
National Natural Science Foundation of China; US National Science Foundation
Keywords
computational complexity; approximation algorithms; consensus clustering;
DOI
10.1016/j.ipl.2007.06.008
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Computing a suitable measure of consensus among several clusterings of the same data is an important problem that arises in areas such as computational biology and data mining. In this paper, we formalize a set-theoretic model for computing such a similarity measure. Roughly speaking, in this model we have k > 1 partitions (clusterings) of the same data set, each containing the same number of sets, and the goal is to align the sets in each partition to minimize a similarity measure. For k = 2, a polynomial-time solution was proposed by Gusfield (Information Processing Letters 82 (2002) 159-164). In this paper, we show that the problem is MAX-SNP-hard for k = 3 even if each set in each partition contains no more than 2 elements, and we provide a (2 - 2/k)-approximation algorithm for the problem for any k. © 2007 Elsevier B.V. All rights reserved.
Pages: 137-145
Page count: 9
Related papers
50 in total
  • [31] A framework to uncover multiple alternative clusterings
    Dang, Xuan Hong
    Bailey, James
    MACHINE LEARNING, 2015, 98 (1-2) : 7 - 30
  • [32] Combining multiple clusterings by soft correspondence
    Long, B
    Zhang, ZF
    Yu, PS
    FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2005, : 282 - 289
  • [33] Multiple clusterings: Recent advances and perspectives
    Yu, Guoxian
    Ren, Liangrui
    Wang, Jun
    Domeniconi, Carlotta
    Zhang, Xiangliang
    COMPUTER SCIENCE REVIEW, 2024, 52
  • [34] Improving Supervised Learning with Multiple Clusterings
    Wemmert, Cedric
    Forestier, Germain
    Derivaux, Sebastien
    APPLICATIONS OF SUPERVISED AND UNSUPERVISED ENSEMBLE METHODS, 2009, 245 : 135 - 149
  • [35] Consensus Affinity Graph Learning for Multiple Kernel Clustering
    Ren, Zhenwen
    Yang, Simon X.
    Sun, Quansen
    Wang, Tao
    IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (06) : 3273 - 3284
  • [36] Heart of the Matter: discovering the consensus of multiple clustering results
    Kosorukoff, Alex
    Sinha, Saurabh
    2008 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, PROCEEDINGS, 2008, : 155 - 162
  • [37] Frequent Closed Patterns Based Multiple Consensus Clustering
    Al-Najdi, Atheer
    Pasquier, Nicolas
    Precioso, Frederic
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, (ICAISC 2016), PT II, 2016, 9693 : 14 - 26
  • [38] Multiple Kernel Clustering with Direct Consensus Graph Learning
    Wang, Yanlong
    Ren, Zhenwen
    ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING (ECC 2021), 2022, 268 : 117 - 127
  • [39] Multiple clusterings of heterogeneous information networks
    Wei, Shaowei
    Yu, Guoxian
    Wang, Jun
    Domeniconi, Carlotta
    Zhang, Xiangliang
    MACHINE LEARNING, 2021, 110 (06) : 1505 - 1526
  • [40] Are clusterings of multiple data views independent?
    Gao, Lucy L.
    Bien, Jacob
    Witten, Daniela
    BIOSTATISTICS, 2020, 21 (04) : 692 - 708