Distributed K-Means clustering guaranteeing local differential privacy

Cited by: 37
Authors
Xia, Chang [1]
Hua, Jingyu [1]
Tong, Wei [1]
Zhong, Sheng [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
Funding
National Key R&D Program of China
Keywords
Differential privacy; Randomized response; Machine learning; Distributed clustering; K-Means
DOI
10.1016/j.cose.2019.101699
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many cases, a service provider may need to aggregate data from end-users to perform mining tasks such as K-means clustering. Such data, however, often contain sensitive information. In this paper, we propose the first locally differentially private K-means mechanism for this distributed scenario. Unlike standard differentially private clustering mechanisms, the proposed mechanism does not need any trusted third party to collect and preprocess users' data. Our mechanism first perturbs users' data locally to satisfy local differential privacy (LDP). It then revises the traditional K-means algorithm so that the service provider can obtain high-quality clustering results by collaborating with users on the highly perturbed data. We prove that our mechanism enables high-utility clustering while guaranteeing local differential privacy for each user. We also propose an extended mechanism that improves on our basic model in terms of privacy and utility. In this mechanism, we perturb both users' sensitive data and the intermediate results of users' clusters in each iteration. Moreover, we consider a more general setting in which users may have different privacy requirements. Extensive experiments on two real-world datasets show that our proposal preserves the quality of the clustering results well. (C) 2019 Elsevier Ltd. All rights reserved.
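The abstract and keywords name randomized response as the local perturbation primitive but give no details. As a minimal sketch only, and not the paper's actual mechanism, the following shows textbook binary randomized response together with the debiasing step an aggregator could apply. The function names, the single-bit encoding of a user's value, and the example data are all assumptions made for illustration.

```python
import math
import random


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it.

    The ratio between the two report probabilities is e^eps, which is the
    epsilon-LDP condition for a single binary value.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def debiased_mean(reports: list[int], epsilon: float) -> float:
    """Unbiased estimate of the true fraction of 1s from perturbed reports.

    E[observed] = f * p + (1 - f) * (1 - p), so f is recovered by inverting
    that linear relation.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical data: 70% of users hold bit 1. Each user perturbs locally,
# and the aggregator only ever sees the perturbed reports.
rng = random.Random(0)
true_bits = [1] * 7000 + [0] * 3000
reports = [randomized_response(b, epsilon=1.0, rng=rng) for b in true_bits]
estimate = debiased_mean(reports, epsilon=1.0)
print(f"estimated fraction of 1s: {estimate:.3f}")
```

Individual reports are noisy, but the aggregate estimate concentrates around the true fraction as the number of users grows. That is the general property an LDP clustering scheme relies on when it computes centroids from perturbed data.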
Pages: 11
Related papers
50 in total
  • [1] On the Security of Distributed Multi-Agent K-Means Clustering With Local Differential Privacy
    Shi, Congcong
    Huang, Xiuli
    Yu, Pengfei
    IEEE ACCESS, 2024, 12 : 124751 - 124763
  • [2] K-Means Clustering with Local Distance Privacy
    Yang, Mengmeng
    Huang, Longxia
    Tang, Chenghua
    BIG DATA MINING AND ANALYTICS, 2023, 6 (04) : 433 - 442
  • [3] Distributed threshold k-means clustering for privacy preserving data mining
    Baby, Vadlana
    Chandra, N. Subhash
    2016 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2016, : 2286 - 2289
  • [4] K-Means Clustering With Local dχ-Privacy for Privacy-Preserving Data Analysis
    Yang, Mengmeng
    Tjuawinata, Ivan
    Lam, Kwok-Yan
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2022, 17 : 2524 - 2537
  • [5] K-Means Clustering with Distributed Dimensions
    Ding, Hu
    Liu, Yu
    Huang, Lingxiao
    Li, Jian
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [6] Efficient privacy-preserving outsourced k-means clustering on distributed data
    Qiu, Guowei
    Zhao, Yingliang
    Gui, Xiaolin
    INFORMATION SCIENCES, 2024, 674
  • [7] An Efficient Approach for Privacy Preserving Distributed K-Means Clustering in Unsecured Environment
    Shewale, Amit
    Keshavamurthy, B. N.
    Modi, Chirag N.
    RECENT FINDINGS IN INTELLIGENT COMPUTING TECHNIQUES, VOL 1, 2019, 707 : 425 - 431
  • [8] Privacy Preserving Distributed Cell-based K-means Clustering Algorithm
    Su, Fang
    Zu, Yun-xiao
    Li, Wei-hai
    INTERNATIONAL CONFERENCE ON MATHEMATICS, MODELLING AND SIMULATION TECHNOLOGIES AND APPLICATIONS (MMSTA 2017), 2017, 215 : 377 - 383
  • [9] Privacy Preserving Approximate K-means Clustering
    Biswas, Chandan
    Ganguly, Debasis
    Roy, Dwaipayan
    Bhattacharya, Ujjwal
    PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19), 2019, : 1321 - 1330
  • [10] Efficient Privacy Preserving K-Means Clustering
    Upmanyu, Maneesh
    Namboodiri, Anoop M.
    Srinathan, Kannan
    Jawahar, C. V.
    INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS, 2010, 6122 : 154 - 166