Distributed K-Means clustering guaranteeing local differential privacy

Citations: 37
Authors
Xia, Chang [1]
Hua, Jingyu [1]
Tong, Wei [1]
Zhong, Sheng [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Differential privacy; Randomized response; Machine learning; Distributed clustering; K-Means;
DOI
10.1016/j.cose.2019.101699
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
In many cases, a service provider may need to aggregate data from end users to perform mining tasks such as K-means clustering. Nevertheless, such data often contain sensitive information. In this paper, we propose the first locally differentially private K-means mechanism for this distributed scenario. Unlike standard differentially private clustering mechanisms, the proposed mechanism does not need any trusted third party to collect and preprocess users' data. Our mechanism first perturbs users' data locally to satisfy local differential privacy (LDP). It then revises the traditional K-means algorithm so that the service provider can obtain high-quality clustering results by collaborating with users on the basis of the highly perturbed data. We prove that our mechanism enables high-utility clustering while guaranteeing local differential privacy for each user. We also propose an extended mechanism that improves on our basic model in terms of both privacy and utility. In this mechanism, we perturb both users' sensitive data and the intermediate results of users' clusters in each iteration. Moreover, we consider a more general setting in which users may have different privacy requirements. Extensive experiments on two real-world datasets show that our proposal preserves the quality of the clustering results well. (C) 2019 Elsevier Ltd. All rights reserved.
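The local perturbation step mentioned in the abstract can be illustrated with classic binary randomized response, which the keywords name as a building block. The sketch below is not the paper's mechanism: the function names `perturb_bit` and `estimate_fraction`, the one-bit-per-user encoding, and the choice of keep probability e^ε / (e^ε + 1) are assumptions made only for illustration. It shows how each user could randomize a bit on their own device and how a collector could still recover an unbiased aggregate.

```python
import math
import random


def perturb_bit(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it.

    This is standard binary randomized response, which satisfies
    epsilon-local differential privacy for a single bit (epsilon > 0).
    """
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_keep else 1 - bit


def estimate_fraction(reports: list, epsilon: float) -> float:
    """Unbiased estimate of the true fraction of 1-bits from perturbed reports.

    Uses E[observed] = f * (2p - 1) + (1 - p), solved for f.
    """
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical population: 30% of users hold the bit 1.
    true_bits = [1] * 6000 + [0] * 14000
    reports = [perturb_bit(b, 1.0, rng) for b in true_bits]
    print(round(estimate_fraction(reports, 1.0), 3))
```

A smaller ε flips more bits and gives stronger privacy, at the cost of a noisier estimate; this is the kind of privacy and utility trade-off the abstract refers to.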
Pages: 11