Differentially Private K-Means Publishing with Distributed Dimensions

Citations: 0
Authors
Zhu, Boyu [1]
Zhang, Yuan [1]
Chen, Tingting [2]
Zhong, Sheng [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Comp Sci & Technol Dept, Nanjing, Peoples R China
[2] Calif State Polytech Univ Pomona, Dept Comp Sci, Coll Sci, Pomona, CA USA
Funding
National Science Foundation (USA); National Natural Science Foundation of China
DOI
10.1109/CSCWD61410.2024.10580021
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper addresses dataset privacy when k-means clustering results are published in a distributed-dimension setting. Leveraging differential privacy mechanisms, we propose a framework that integrates a differentially private classifier, built by voting over raw clustering results, with an enhanced generative adversarial network (GAN) that simulates the classifier's behavior when inferring class labels for a public dataset. The approach generates synthetic clustering results that mimic real outcomes in classification tasks while ensuring differential privacy and minimizing noise. Our contributions are an exploration of the privacy issues, the privacy-preserving k-means clustering framework itself, and theoretical analyses of its sensitivity and differential privacy guarantees. On the MNIST dataset, the framework achieves 82.22% accuracy under a (10.48, 10^-9)-differential-privacy guarantee, compared with 83.45% accuracy without privacy preservation.
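The abstract does not spell out how the voting step is made private, so the following is only a minimal sketch of one common way to do it: add Laplace noise to the per-class vote counts and release the noisy argmax. The function name `noisy_vote`, the `noise_scale` parameter, and the assumption that every voter's cluster labels have already been mapped into a shared label space are all hypothetical and not taken from the paper.

```python
import numpy as np


def noisy_vote(labels, num_classes, noise_scale, rng):
    """Return a class label chosen by a noisy majority vote.

    labels      : 1-D integer array, one vote per voter for a single sample
                  (assumed to be aligned to a shared label space).
    num_classes : number of possible labels.
    noise_scale : scale of the Laplace noise added to each count
                  (hypothetical parameter; larger means more privacy, less accuracy).
    rng         : a numpy.random.Generator used to draw the noise.
    """
    # Count how many voters chose each class.
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    # Perturb every count independently with Laplace noise.
    noisy_counts = counts + rng.laplace(loc=0.0, scale=noise_scale, size=num_classes)
    # Release only the index of the largest noisy count.
    return int(np.argmax(noisy_counts))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    votes = np.array([3] * 90 + [1] * 10)  # 90 voters say class 3, 10 say class 1
    print(noisy_vote(votes, num_classes=10, noise_scale=1.0, rng=rng))
```

When the vote is lopsided, as in the usage example, the noise almost never changes the winner; when voters are split, the noise can flip the outcome, which is where the accuracy cost of privacy would show up.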
Pages: 3263-3268
Page count: 6