Differentially Private Clustering in High-Dimensional Euclidean Spaces

Cited by: 0
Authors
Balcan, Maria-Florina [1]
Dick, Travis [1]
Liang, Yingyu [2]
Mou, Wenlong [3]
Zhang, Hongyang [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Peking Univ, Beijing, People's Republic of China
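Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract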
We study the problem of clustering sensitive data while preserving the privacy of individuals represented in the dataset, which has broad applications in practical machine learning and data analysis tasks. Although the problem has been widely studied in the context of low-dimensional, discrete spaces, much remains unknown concerning private clustering in high-dimensional Euclidean spaces $\mathbb{R}^d$. In this work, we give differentially private and efficient algorithms achieving strong guarantees for k-means and k-median clustering when $d = \Omega(\mathrm{polylog}(n))$. Our algorithm achieves clustering loss at most $\log^3(n)\,\mathrm{OPT} + \mathrm{poly}(\log n, d, k)$, advancing the state-of-the-art result of $\sqrt{d}\,\mathrm{OPT} + \mathrm{poly}(\log n, d^d, k^d)$. We also study the case where the data points are s-sparse and show that the clustering loss can scale logarithmically with d, i.e., $\log^3(n)\,\mathrm{OPT} + \mathrm{poly}(\log n, \log d, k, s)$. Experiments on both synthetic and real datasets verify the effectiveness of the proposed method.
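For readers unfamiliar with the notation, the loss and the guarantee quoted in the abstract can be written out as below. This is only a sketch based on the standard definitions of the k-means and k-median objectives; the symbols $x_i$, $C$, $c_j$, $\hat{C}$, $\alpha$, and $\beta$ are notation assumed here and do not appear in the record itself. Guarantees of this kind are typically stated with high probability over the algorithm's randomness.

```latex
% Standard k-means (p = 2) and k-median (p = 1) loss of a set of k centers
% C = {c_1, ..., c_k} on data points x_1, ..., x_n in R^d (assumed notation).
\[
  L_p(C) = \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert x_i - c_j \rVert_2^{p},
  \qquad
  \mathrm{OPT} = \min_{|C| = k} L_p(C).
\]
% Form of the guarantee quoted in the abstract for the private output \hat{C}:
% a multiplicative factor alpha on OPT plus an additive term beta.
\[
  L_p(\hat{C}) \le \alpha \cdot \mathrm{OPT} + \beta,
  \qquad
  \alpha = \log^3(n), \quad \beta = \mathrm{poly}(\log n, d, k).
\]
```

Read this way, the comparison in the abstract is between a multiplicative factor of $\log^3(n)$ and one of $\sqrt{d}$, and between additive terms polynomial in $d$ and $k$ and additive terms involving $d^d$ and $k^d$.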
Pages: 10