Noise-Aware Statistical Inference with Differentially Private Synthetic Data

Citations: 0
Authors
Raisa, Ossi [1]
Jalko, Joonas [1]
Kaski, Samuel [2, 3]
Honkela, Antti [1]
Affiliations
[1] Univ Helsinki, Helsinki, Finland
[2] Aalto Univ, Espoo, Finland
[3] Univ Manchester, Manchester, England
Funding
Academy of Finland;
Keywords
FOUNDATIONS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
While generation of synthetic data under differential privacy (DP) has received a lot of attention in the data privacy community, analysis of synthetic data has received much less. Existing work has shown that simply analysing DP synthetic data as if it were real does not produce valid inferences of population-level quantities. For example, confidence intervals become too narrow, which we demonstrate with a simple experiment. We tackle this problem by combining synthetic data analysis techniques from the field of multiple imputation (MI) and synthetic data generation using noise-aware (NA) Bayesian modeling into a pipeline, NA+MI, that allows computing accurate uncertainty estimates for population-level quantities from DP synthetic data. To implement NA+MI for discrete data generation using the values of marginal queries, we develop a novel noise-aware synthetic data generation algorithm, NAPSU-MQ, using the principle of maximum entropy. Our experiments demonstrate that the pipeline is able to produce accurate confidence intervals from DP synthetic data. The intervals become wider with tighter privacy to accurately capture the additional uncertainty stemming from DP noise.
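The multiple-imputation step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which combining rules the authors use, so the code below assumes the standard rules for fully synthetic data: the point estimate is the mean of the per-dataset estimates, and the variance estimate is (1 + 1/m) times the between-dataset variance minus the mean within-dataset variance. The function names, the normal-approximation interval, and the input numbers are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from statistics import NormalDist, mean, variance


def combine_synthetic_estimates(point_estimates, variance_estimates):
    """Combine results from m synthetic datasets.

    Assumed rules (fully synthetic data): q_bar is the mean of the
    per-dataset point estimates, and t = (1 + 1/m) * b - v_bar, where b is
    the between-dataset sample variance and v_bar the mean of the
    per-dataset variance estimates.
    """
    m = len(point_estimates)
    if m < 2 or m != len(variance_estimates):
        raise ValueError("need at least two datasets and matching lengths")
    q_bar = mean(point_estimates)
    b = variance(point_estimates)      # between-dataset sample variance
    v_bar = mean(variance_estimates)   # mean within-dataset variance
    t = (1 + 1 / m) * b - v_bar
    return q_bar, t


def normal_interval(q_bar, t, level=0.95):
    """Normal-approximation interval (a simplification assumed here)."""
    if t <= 0:
        # The raw variance estimate can be non-positive; this sketch does
        # not attempt a correction and simply refuses to build an interval.
        raise ValueError("combined variance estimate is not positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * t ** 0.5
    return q_bar - half_width, q_bar + half_width


# Made-up estimates of one quantity from five synthetic datasets.
estimates = [0.52, 0.47, 0.55, 0.50, 0.49]
variances = [0.0004] * 5
q_bar, t = combine_synthetic_estimates(estimates, variances)
low, high = normal_interval(q_bar, t)
print(f"estimate {q_bar:.3f}, interval ({low:.3f}, {high:.3f})")
```

Under these assumed rules, the interval width is driven by how much the estimates vary across synthetic datasets. If the generator propagates DP noise into that variation, a stricter privacy setting would spread the estimates further apart and widen the interval, which matches the behaviour the abstract reports.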
Pages: 24