Noise-Aware Statistical Inference with Differentially Private Synthetic Data

Citations: 0
Authors
Raisa, Ossi [1]
Jalko, Joonas [1]
Kaski, Samuel [2, 3]
Honkela, Antti [1]
Affiliations
[1] Univ Helsinki, Helsinki, Finland
[2] Aalto Univ, Espoo, Finland
[3] Univ Manchester, Manchester, England
Funding
Academy of Finland;
Keywords
FOUNDATIONS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
While generation of synthetic data under differential privacy (DP) has received a lot of attention in the data privacy community, analysis of synthetic data has received much less. Existing work has shown that simply analysing DP synthetic data as if it were real does not produce valid inferences of population-level quantities. For example, confidence intervals become too narrow, which we demonstrate with a simple experiment. We tackle this problem by combining synthetic data analysis techniques from the field of multiple imputation (MI), and synthetic data generation using noise-aware (NA) Bayesian modeling into a pipeline NA+MI that allows computing accurate uncertainty estimates for population-level quantities from DP synthetic data. To implement NA+MI for discrete data generation using the values of marginal queries, we develop a novel noise-aware synthetic data generation algorithm NAPSU-MQ using the principle of maximum entropy. Our experiments demonstrate that the pipeline is able to produce accurate confidence intervals from DP synthetic data. The intervals become wider with tighter privacy to accurately capture the additional uncertainty stemming from DP noise.
Pages: 24
Related papers
50 in total
  • [21] Private Sampling: A Noiseless Approach for Generating Differentially Private Synthetic Data
    Boedihardjo, March
    Strohmer, Thomas
    Vershynin, Roman
    SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2022, 4 (03): 1082 - 1115
  • [22] Noise-Aware Network Embedding for Multiplex Network
    Chu, Xiaokai
    Fan, Xinxin
    Yao, Di
    Zhang, Chen-Lin
    Huang, Jianhui
    Bi, Jingping
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [23] Noise-aware driver Modeling for nanometer technology
    Bai, XL
    Chandra, R
    Dey, S
    Srinivas, PV
    4TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN, PROCEEDINGS, 2003: 177 - 182
  • [24] Inherent Noise-Aware Insect Swarm Simulation
    Wang, Xinjie
    Jin, Xiaogang
    Deng, Zhigang
    Zhou, Linling
    COMPUTER GRAPHICS FORUM, 2014, 33 (06) : 51 - 62
  • [25] Ensemble Clustering by Noise-Aware Graph Decomposition
    Chen, Mansheng
    Huang, Dong
    He, Mingkai
    Wang, Chang-Dong
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: IOT AND SMART CITY (ICIT 2018), 2018: 184 - 188
  • [26] Noise-aware image deconvolution with multidirectional filters
    Yang, Hang
    Zhu, Ming
    Huang, Heyan
    Zhang, Zhongbo
    APPLIED OPTICS, 2013, 52 (27) : 6792 - 6798
  • [27] FANATIC: FAst Noise-Aware TopIc Clustering
    Silburt, Ari
    Subasic, Anja
    Thompson, Evan
    Dsilva, Carmeline
    Fares, Tarec
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021: 650 - 663
  • [28] Differentially Private Network Data Release via Structural Inference
    Xiao, Qian
    Chen, Rui
    Tan, Kian-Lee
    PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14), 2014: 911 - 920
  • [29] KAMINO: Constraint-Aware Differentially Private Data Synthesis
    Ge, Chang
    Mohapatra, Shubhankar
    He, Xi
    Ilyas, Ihab F.
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (10): 1886 - 1899
  • [30] Differentially private and utility-aware publication of trajectory data
    Liu, Qi
    Yu, Juan
    Han, Jianmin
    Yao, Xin
    Expert Systems with Applications, 2021, 180