Correlation Clustering with Noisy Input

Cited by: 0
Authors
Mathieu, Claire [1]
Schudy, Warren [1]
Affiliations
[1] Brown Univ, Dept Comp Sci, Providence, RI 02912 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Correlation clustering is a type of clustering that uses a basic form of input data: for every pair of data items, the input specifies whether they are similar (belonging to the same cluster) or dissimilar (belonging to different clusters). This information may be inconsistent, and the goal is to find a clustering (partition of the vertices) that disagrees with as few pieces of information as possible. Correlation clustering is APX-hard for worst-case inputs. We study the following semi-random noisy model to generate the input: start from an arbitrary partition of the vertices into clusters. Then, for each pair of vertices, the similarity information is corrupted (noisy) independently with probability $p$. Finally, an adversary generates the input by choosing similarity/dissimilarity information arbitrarily for each corrupted pair of vertices. In this model, our algorithm produces a clustering with cost at most $1 + O(n^{-1/6})$ times the cost of the optimal clustering, as long as $p \le 1/2 - n^{-1/3}$. Moreover, if all clusters have size at least $c_1 \sqrt{n}$, then we can exactly reconstruct the planted clustering. If the noise $p$ is small, that is, $p \le n^{-\delta}/60$, then we can exactly reconstruct all clusters of the planted clustering that have size at least $3150/\delta$, and provide a certificate (witness) proving that those clusters are in any optimal clustering. Among other techniques, we use the natural semi-definite programming relaxation followed by an interesting rounding phase. The analysis uses SDP duality and spectral properties of random matrices.
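The semi-random input model and the disagreement cost described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm (no SDP or rounding is involved); it only generates a noisy input and counts disagreements. All names (`planted_labels`, `semi_random_input`, `flip_adversary`, `disagreements`) are hypothetical, and the per-pair adversary callback is a simplifying assumption: the adversary in the abstract may choose labels for corrupted pairs arbitrarily, while always flipping is just one possible choice.

```python
import itertools
import random


def cluster_index(clusters):
    """Map each vertex to the index of the cluster that contains it."""
    return {v: i for i, cluster in enumerate(clusters) for v in cluster}


def planted_labels(clusters):
    """Label every unordered pair (u, v), u < v: True = similar, False = dissimilar."""
    cluster_of = cluster_index(clusters)
    return {
        (u, v): cluster_of[u] == cluster_of[v]
        for u, v in itertools.combinations(sorted(cluster_of), 2)
    }


def flip_adversary(pair, true_label):
    """Hypothetical adversary: always reports the opposite of the planted label."""
    return not true_label


def semi_random_input(clusters, p, adversary, rng):
    """Corrupt each pair independently with probability p; the adversary
    chooses the label of every corrupted pair (assumed per-pair callback)."""
    labels = {}
    for pair, true_label in planted_labels(clusters).items():
        if rng.random() < p:
            labels[pair] = adversary(pair, true_label)
        else:
            labels[pair] = true_label
    return labels


def disagreements(labels, clusters):
    """Cost of a clustering: number of pairs whose input label it contradicts."""
    cluster_of = cluster_index(clusters)
    return sum(
        1
        for (u, v), similar in labels.items()
        if (cluster_of[u] == cluster_of[v]) != similar
    )


if __name__ == "__main__":
    planted = [[0, 1, 2], [3, 4], [5]]
    noisy = semi_random_input(planted, 0.2, flip_adversary, random.Random(0))
    print("cost of planted clustering:", disagreements(noisy, planted))
```

Under this sketch, the planted clustering has cost zero when $p = 0$, and its cost equals the number of corrupted pairs when the adversary flips every corrupted label.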
Pages: 712-728
Number of pages: 17