GAPBAS: Genetic algorithm-based privacy budget allocation strategy in differential privacy K-means clustering algorithm

Cited by: 9
Authors
Li, Yong [1]
Song, Xiao [1]
Tu, Yuchun [1]
Liu, Ming [1]
Affiliations
[1] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100019, Peoples R China
Funding
Beijing Natural Science Foundation
Keywords
Differential privacy; Genetic algorithm; Privacy budget allocation; Combinatorial optimization problem; K-means clustering algorithm
DOI
10.1016/j.cose.2023.103697
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The differential privacy k-means (DP k-means) clustering algorithm emerged to address privacy protection challenges in data mining. However, the algorithm has difficulty achieving both clustering usability and convergence. The privacy budget (epsilon), a critical parameter that determines how much noise a differential privacy algorithm adds, has therefore attracted significant attention, and researchers have turned to studying privacy budget allocation strategies within the DP k-means clustering algorithm. Selecting a privacy budget allocation strategy for DP k-means is, however, an NP-hard problem. Our initial intuition is that genetic algorithms can efficiently discover near-optimal privacy budget sequences. We therefore propose a genetic algorithm-based privacy budget allocation strategy (GAPBAS) to ensure the convergence and usability of the DP k-means algorithm. First, convergence is ensured by selecting improved initial centroids and strictly controlling the minimum privacy budget of the DP k-means algorithm. Second, privacy budget allocation in the DP k-means algorithm is reformulated as a combinatorial optimization problem: the privacy budgets of the iterative rounds are merged into a single sequence, and a genetic algorithm selects the best allocation, which significantly improves the usability of the DP k-means algorithm. Comparative experiments against four other privacy budget allocation strategies for the DP k-means algorithm show that the genetic algorithm-based strategy performs better at the same level of privacy protection.
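The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the noise mechanism, the fitness function, or the genetic operators, so everything below is an assumption. The sketch assumes points scaled to [0, 1]^d, Laplace noise with each round's budget split evenly between cluster counts and cluster sums, a fitness equal to the negative sum of squared errors (evaluated directly on the data, which a real deployment would also have to account for in its privacy analysis), and simple one-point crossover with Gaussian mutation. All function and parameter names (`dp_kmeans`, `ga_allocate`, `eps_min`, and so on) are hypothetical.

```python
import random

def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def to_budgets(weights, total_eps, eps_min):
    # Map positive weights to per-round budgets that sum to total_eps,
    # each at least eps_min (assumes total_eps > len(weights) * eps_min).
    spare = total_eps - len(weights) * eps_min
    s = sum(weights)
    return [eps_min + spare * w / s for w in weights]

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centroids):
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))

def dp_kmeans(points, centroids, budgets, rng):
    # Assumption: points lie in [0, 1]^d, so one record changes a cluster
    # count by at most 1 and a cluster sum by at most d (L1 sensitivity).
    d = len(points[0])
    for eps in budgets:
        sums = [[0.0] * d for _ in centroids]
        counts = [0] * len(centroids)
        for p in points:
            j = nearest(p, centroids)
            counts[j] += 1
            for a in range(d):
                sums[j][a] += p[a]
        updated = []
        for j, old in enumerate(centroids):
            n = counts[j] + laplace(2.0 / eps, rng)  # eps/2 spent on the count
            if n < 1.0:
                updated.append(old)  # noisy count unusable: keep old centroid
                continue
            # eps/2 spent on the sum; clamp the result back into [0, 1].
            updated.append([min(1.0, max(0.0, (sums[j][a] + laplace(2.0 * d / eps, rng)) / n))
                            for a in range(d)])
        centroids = updated
    return centroids

def sse(points, centroids):
    return sum(sq_dist(p, centroids[nearest(p, centroids)]) for p in points)

def ga_allocate(points, init_centroids, total_eps, rounds, eps_min,
                pop_size=8, generations=5, seed=0):
    rng = random.Random(seed)

    def fitness(weights):
        # Same noise seed for every candidate, so fitness is deterministic.
        budgets = to_budgets(weights, total_eps, eps_min)
        return -sse(points, dp_kmeans(points, init_centroids, budgets, random.Random(seed)))

    pop = [[rng.uniform(0.1, 1.0) for _ in range(rounds)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, rounds)       # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(rounds)            # mutate one gene, keep it positive
            child[i] = max(0.05, child[i] + rng.gauss(0.0, 0.2))
            children.append(child)
        pop = parents + children
    return to_budgets(max(pop, key=fitness), total_eps, eps_min)

# Usage on two synthetic, well-separated clusters.
data_rng = random.Random(1)
points = ([[data_rng.uniform(0.0, 0.3), data_rng.uniform(0.0, 0.3)] for _ in range(100)]
          + [[data_rng.uniform(0.7, 1.0), data_rng.uniform(0.7, 1.0)] for _ in range(100)])
budgets = ga_allocate(points, [[0.2, 0.2], [0.8, 0.8]],
                      total_eps=2.0, rounds=4, eps_min=0.1)
```

Encoding each candidate as positive weights and rescaling them in `to_budgets` keeps every sequence within the total budget and above the per-round minimum, so crossover and mutation never produce an invalid allocation.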
Pages: 12
Related papers
50 in total
  • [1] A differential privacy protecting K-means clustering algorithm based on contour coefficients
    Zhang, Yaling
    Liu, Na
    Wang, Shangping
    PLOS ONE, 2018, 13 (11):
  • [2] Outlier-eliminated k-means clustering algorithm based on differential privacy preservation
    Qingying Yu
    Yonglong Luo
    Chuanming Chen
    Xintao Ding
    Applied Intelligence, 2016, 45 : 1179 - 1191
  • [3] An Improved Differential Privacy K-means Algorithm Based on MapReduce
    Yao, Shunyuan
    2018 11TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2018, : 141 - 145
  • [4] Outlier-eliminated k-means clustering algorithm based on differential privacy preservation
    Yu, Qingying
    Luo, Yonglong
    Chen, Chuanming
    Ding, Xintao
    APPLIED INTELLIGENCE, 2016, 45 (04) : 1179 - 1191
  • [5] Privacy Preserving Distributed Cell-based K-means Clustering Algorithm
    Su, Fang
    Zu, Yun-xiao
    Li, Wei-hai
    INTERNATIONAL CONFERENCE ON MATHEMATICS, MODELLING AND SIMULATION TECHNOLOGIES AND APPLICATIONS (MMSTA 2017), 2017, 215 : 377 - 383
  • [6] A reversible privacy-preserving clustering technique based on k-means algorithm
    Lin, Chen-Yi
    APPLIED SOFT COMPUTING, 2020, 87
  • [7] RETRACTED: CVDP k-means clustering algorithm for differential privacy based on coefficient of variation (Retracted Article)
    Kong, Yuting
    Qian, Yurong
    Tan, Fuxiang
    Bai, Lu
    Shao, Jinxin
    Ma, Tinghuai
    Tereshchenko, Sergei Nikolayevich
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (05) : 6027 - 6045
  • [8] On K-means Data Clustering Algorithm with Genetic Algorithm
    Kapil, Shruti
    Chawla, Meenu
    Ansari, Mohd Dilshad
    2016 FOURTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2016, : 202 - 206
  • [9] A K-means Based Genetic Algorithm for Data Clustering
    Pizzuti, Clara
    Procopio, Nicola
    INTERNATIONAL JOINT CONFERENCE SOCO'16- CISIS'16-ICEUTE'16, 2017, 527 : 211 - 222
  • [10] A K-means Optimized Clustering Algorithm Based on Improved Genetic Algorithm
    Pu, Qiu-Mei
    Wu, Qiong
    Li, Qian
    Lecture Notes in Electrical Engineering, 2022, 801 LNEE : 133 - 140