Privacy for Free: How does Dataset Condensation Help Privacy?

Authors
Dong, Tian [1, 3]
Zhao, Bo [2]
Lyu, Lingjuan [3]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Univ Edinburgh, Sch Informat, Edinburgh, Scotland
[3] Sony AI, Tokyo, Japan
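Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract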
To prevent unintentional data leakage, the research community has resorted to data generators that can produce differentially private data for model training. However, for the sake of data privacy, existing solutions suffer from either expensive training cost or poor generalization performance. We therefore ask whether training efficiency and privacy can be achieved simultaneously. In this work, we identify for the first time that dataset condensation (DC), which was originally designed to improve training efficiency, is also a better alternative to traditional data generators for private data generation, thus providing privacy for free. To demonstrate the privacy benefit of DC, we build a connection between DC and differential privacy, and theoretically prove on linear feature extractors (with an extension to non-linear feature extractors) that the existence of one sample has limited impact (O(m/n)) on the parameter distribution of networks trained on m samples synthesized by DC from n (n >> m) raw samples. We also empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both loss-based and state-of-the-art likelihood-based membership inference attacks. We envision this work as a milestone for data-efficient and privacy-preserving machine learning.
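The loss-based membership inference attack mentioned in the abstract can be illustrated with a minimal sketch. This is a generic loss-threshold attack, not necessarily the evaluation protocol used by the authors: a sample is guessed to be a training member when the model's loss on it falls below a threshold. The function names, the threshold, and the toy loss values are all hypothetical and chosen only for illustration.

```python
import math


def cross_entropy(probs, label):
    """Per-sample cross-entropy loss from predicted class probabilities."""
    # Clamp to avoid log(0) when the model assigns zero probability.
    return -math.log(max(probs[label], 1e-12))


def loss_threshold_attack(losses, threshold):
    """Guess 'member' (True) for every sample whose loss is below the threshold."""
    return [loss < threshold for loss in losses]


def attack_advantage(member_losses, nonmember_losses, threshold):
    """True-positive rate minus false-positive rate of the threshold attack."""
    tpr = sum(loss_threshold_attack(member_losses, threshold)) / len(member_losses)
    fpr = sum(loss_threshold_attack(nonmember_losses, threshold)) / len(nonmember_losses)
    return tpr - fpr


# Toy losses invented for illustration (not results from the paper).
member_losses = [0.05, 0.10, 0.20, 0.90]     # samples assumed to be in the training set
nonmember_losses = [0.30, 0.80, 1.20, 0.15]  # samples assumed to be held out
advantage = attack_advantage(member_losses, nonmember_losses, threshold=0.25)
print(f"attack advantage: {advantage:.2f}")
```

An advantage near zero means the attack cannot separate members from non-members by loss alone, which is the kind of outcome the membership-privacy claim in the abstract is about.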
Pages: 19