Distributed K-Means clustering guaranteeing local differential privacy

Cited by: 37
Authors
Xia, Chang [1]
Hua, Jingyu [1]
Tong, Wei [1]
Zhong, Sheng [1]
Affiliation
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Differential privacy; Randomized response; Machine learning; Distributed clustering; K-Means;
DOI
10.1016/j.cose.2019.101699
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In many cases, a service provider may need to aggregate data from end users to perform mining tasks such as K-means clustering. However, such data often contain sensitive information. In this paper, we propose the first locally differentially private K-means mechanism for this distributed scenario. Unlike standard differentially private clustering mechanisms, the proposed mechanism does not need a trusted third party to collect and preprocess users' data. Our mechanism first perturbs users' data locally to satisfy local differential privacy (LDP). It then revises the traditional K-means algorithm so that the service provider can obtain high-quality clustering results by collaborating with users on the highly perturbed data. We prove that our mechanism enables high-utility clustering while guaranteeing local differential privacy for each user. We also propose an extended mechanism that improves the basic model in terms of privacy and utility. In this mechanism, we perturb both users' sensitive data and the intermediate results of users' clusters in each iteration. Moreover, we consider a more general setting in which users may have different privacy requirements. Extensive experiments on two real-world datasets show that our proposal preserves the quality of the clustering results well. (C) 2019 Elsevier Ltd. All rights reserved.
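The abstract names local perturbation and, in the keywords, randomized response, but does not spell out the mechanism. As an illustration only, here is a minimal sketch of textbook binary randomized response in Python. It is not the paper's mechanism. The function names, the per-bit budget `epsilon`, and the one-bit-per-user setup are all assumptions made for this example. Each user reports a bit truthfully with probability e^ε / (e^ε + 1), and the collector debiases the aggregate.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting a bit truthfully (hypothetical helper)."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def perturb_bits(bits, epsilon, rng):
    """Flip each bit independently with probability 1 - p, on the user's side.

    Assumption: epsilon is the budget spent on EACH bit; perturbing several
    bits of one user this way consumes the sum of the per-bit budgets.
    """
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_mean(reported, epsilon):
    """Unbiased estimate of the true fraction of 1s from perturbed reports.

    E[reported] = (1 - p) + f * (2p - 1), solved here for f.
    """
    p = keep_probability(epsilon)
    observed = sum(reported) / len(reported)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic population: 30% of users hold the bit 1.
    true_bits = [1 if i % 10 < 3 else 0 for i in range(20000)]
    # One bit per user, so a single call stands in for all users' reports.
    reports = perturb_bits(true_bits, epsilon=1.0, rng=rng)
    print("estimated fraction of 1s:", estimate_mean(reports, epsilon=1.0))
```

The estimate is noisy for any single user but concentrates around the true fraction as the number of users grows. This is why a service provider can still extract aggregate structure, such as cluster centroids, from heavily perturbed reports.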
Pages: 11
Related papers
50 in total
  • [22] Outlier-eliminated k-means clustering algorithm based on differential privacy preservation
    Yu, Qingying
    Luo, Yonglong
    Chen, Chuanming
    Ding, Xintao
    APPLIED INTELLIGENCE, 2016, 45 (04) : 1179 - 1191
  • [23] Privacy-Preserving k-Means Clustering under Multiowner Setting in Distributed Cloud Environments
    Rong, Hong
    Wang, Huimei
    Liu, Jian
    Hao, Jialu
    Xian, Ming
    SECURITY AND COMMUNICATION NETWORKS, 2017,
  • [24] Distributed k-means Clustering with Low Transmission Cost
    Naldi, Murilo Coelho
    Gabrielli Barreto Campello, Ricardo Jose
    2013 BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 2013, : 70 - 75
  • [25] A distributed framework for trimmed Kernel k-Means clustering
    Tsapanos, Nikolaos
    Tefas, Anastasios
    Nikolaidis, Nikolaos
    Pitas, Ioannis
    PATTERN RECOGNITION, 2015, 48 (08) : 2685 - 2698
  • [26] Comparison of distributed evolutionary k-means clustering algorithms
    Naldi, M. C.
    Campello, R. J. G. B.
    NEUROCOMPUTING, 2015, 163 : 78 - 93
  • [27] Private Distributed K-Means Clustering on Interval Data
    Huang, Dingquan
    Yao, Xin
    An, Senquan
    Ren, Shengbing
    2021 IEEE INTERNATIONAL PERFORMANCE, COMPUTING, AND COMMUNICATIONS CONFERENCE (IPCCC), 2021,
  • [28] Privacy of outsourced two-party k-means clustering
    Cai, Yunlu
    Tang, Chunming
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (08):
  • [29] Privacy-preserving k-means clustering with local synchronization in peer-to-peer networks
    Zhu, Youwen
    Li, Xingxin
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2020, 13 (06) : 2272 - 2284