Correlation clustering

Cited by: 0

Authors:

Bansal, N [1]

Blum, A [1]

Chawla, S [1]

Affiliation:

[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA

Source:

FOCS 2002: 43RD ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS | 2002

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, methods];

Discipline classification code:

081202 ;

Abstract:

We consider the following clustering problem: we have a complete graph on n vertices (items), where each edge (u, v) is labeled either + or - depending on whether u and v have been deemed to be similar or different. The goal is to produce a partition of the vertices (a clustering) that agrees as much as possible with the edge labels. That is, we want a clustering that maximizes the number of + edges within clusters, plus the number of - edges between clusters (equivalently, minimizes the number of disagreements: the number of - edges inside clusters plus the number of + edges between clusters). This formulation is motivated from a document clustering problem in which one has a pairwise similarity function f learned from past data, and the goal is to partition the current set of documents in a way that correlates with f as much as possible; it can also be viewed as a kind of "agnostic learning" problem. An interesting feature of this clustering formulation is that one does not need to specify the number of clusters k as a separate parameter as in measures such as k-median or min-sum or min-max clustering. Instead, in our formulation, the optimal number of clusters could be any value between 1 and n, depending on the edge labels. We look at approximation algorithms for both minimizing disagreements and for maximizing agreements. For minimizing disagreements, we give a constant factor approximation. For maximizing agreements we give a PTAS. We also show how to extend some of these results to graphs with edge labels in [-1, +1], and give some results for the case of random noise.
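
The objective described in the abstract can be illustrated with a minimal Python sketch. It only evaluates a given clustering by counting agreements and disagreements; it is not the approximation algorithm from the paper. The function names, the dictionary-based input format, and the four-vertex example are assumptions made for illustration.

```python
from itertools import combinations


def count_disagreements(labels, clustering):
    """Count '-' edges inside clusters plus '+' edges between clusters.

    labels:     dict mapping frozenset({u, v}) to '+' or '-' for every pair
                (hypothetical input format, chosen for this sketch).
    clustering: dict mapping each vertex to a cluster identifier.
    """
    disagreements = 0
    for u, v in combinations(sorted(clustering), 2):
        same_cluster = clustering[u] == clustering[v]
        label = labels[frozenset((u, v))]
        if (label == "-" and same_cluster) or (label == "+" and not same_cluster):
            disagreements += 1
    return disagreements


def count_agreements(labels, clustering):
    """Every pair is either an agreement or a disagreement."""
    n = len(clustering)
    return n * (n - 1) // 2 - count_disagreements(labels, clustering)


# Illustrative complete graph on four vertices (made-up labels).
example_labels = {
    frozenset(("a", "b")): "+",
    frozenset(("a", "c")): "+",
    frozenset(("b", "c")): "-",
    frozenset(("a", "d")): "-",
    frozenset(("b", "d")): "-",
    frozenset(("c", "d")): "-",
}

# Three candidate partitions; the number of clusters is not fixed in advance.
one_cluster = {"a": 0, "b": 0, "c": 0, "d": 0}
singletons = {"a": 0, "b": 1, "c": 2, "d": 3}
abc_and_d = {"a": 0, "b": 0, "c": 0, "d": 1}

for name, clustering in [
    ("one cluster", one_cluster),
    ("singletons", singletons),
    ("{a,b,c},{d}", abc_and_d),
]:
    print(name, count_disagreements(example_labels, clustering))
```

In this example the labels are inconsistent (a is similar to both b and c, but b and c are labeled different), so no partition reaches zero disagreements. This is the situation in which minimizing disagreements and maximizing agreements become nontrivial optimization problems.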


Pages: 238 - 247

Page count: 10

50 items in total

[1] Correlation clustering and consensus clustering
Bonizzoni, P
Della Vedova, G
Dondi, R
Jiang, T
ALGORITHMS AND COMPUTATION, 2005, 3827 : 226 - 235
[2] Correlation clustering
Bansal, N
Blum, A
Chawla, S
MACHINE LEARNING, 2004, 56 (1-3) : 89 - 113
[3] Correlation Clustering
Nikhil Bansal
Avrim Blum
Shuchi Chawla
Machine Learning, 2004, 56 : 89 - 113
[4] On the approximation of correlation clustering and consensus clustering
Bonizzoni, Paola
Della Vedova, Gianluca
Dondi, Riccardo
Jiang, Tao
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2008, 74 (05) : 671 - 696
[5] Rough Clustering Generated by Correlation Clustering
Aszalos, Laszlo
Mihalydeak, Tamas
ROUGH SETS, FUZZY SETS, DATA MINING, AND GRANULAR COMPUTING, 2013, 8170 : 315 - 324
[6] LUCKe - Connecting Clustering and Correlation Clustering
Beer, Anna
Stephan, Lisa
Seidl, Thomas
21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 431 - 440
[7] ONLINE CORRELATION CLUSTERING
Mathieu, Claire
Sankur, Ocan
Schudy, Warren
27TH INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF COMPUTER SCIENCE (STACS 2010), 2010, 5 : 573 - 583
[8] Interactive Correlation Clustering
Geerts, Floris
Ndindi, Reuben
2014 INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2014, : 170 - 176
[9] Overlapping correlation clustering
Francesco Bonchi
Aristides Gionis
Antti Ukkonen
Knowledge and Information Systems, 2013, 35 : 1 - 32
[10] Overlapping correlation clustering
Bonchi, Francesco
Gionis, Aristides
Ukkonen, Antti
KNOWLEDGE AND INFORMATION SYSTEMS, 2013, 35 (01) : 1 - 32
