Training Generative Models From Privatized Data via Entropic Optimal Transport

Cited by: 2
Authors
Reshetova, Daria [1]
Chen, Wei-Ning [1]
Ozgur, Ayfer [1]
Affiliations
[1] Stanford Univ, Dept Elect Engn, Stanford, CA 94205 USA
Keywords
Privacy; GANs; entropic optimal transport
DOI
10.1109/JSAIT.2024.3387463
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Local differential privacy is a powerful method for privacy-preserving data collection. In this paper, we develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data. We show that entropic regularization of optimal transport - a popular regularization method in the literature that has often been leveraged for its computational benefits - enables the generator to learn the raw (unprivatized) data distribution even though it only has access to privatized samples. We prove that at the same time this leads to fast statistical convergence at the parametric rate. This shows that entropic regularization of optimal transport uniquely enables the mitigation of both the effects of privatization noise and the curse of dimensionality in statistical convergence. We provide experimental evidence to support the efficacy of our framework in practice.
Pages: 221-235
Number of pages: 15