Noise-Aware Statistical Inference with Differentially Private Synthetic Data

Cited by: 0
Authors
Raisa, Ossi [1]
Jalko, Joonas [1]
Kaski, Samuel [2, 3]
Honkela, Antti [1]
Affiliations
[1] Univ Helsinki, Helsinki, Finland
[2] Aalto Univ, Espoo, Finland
[3] Univ Manchester, Manchester, England
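Funding
Academy of Finland
Keywords
FOUNDATIONS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract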
While the generation of synthetic data under differential privacy (DP) has received a lot of attention in the data privacy community, the analysis of synthetic data has received much less. Existing work has shown that simply analysing DP synthetic data as if it were real does not produce valid inferences of population-level quantities. For example, confidence intervals become too narrow, which we demonstrate with a simple experiment. We tackle this problem by combining synthetic data analysis techniques from the field of multiple imputation (MI) with synthetic data generation using noise-aware (NA) Bayesian modeling. The resulting pipeline, NA+MI, allows computing accurate uncertainty estimates for population-level quantities from DP synthetic data. To implement NA+MI for discrete data generation using the values of marginal queries, we develop a novel noise-aware synthetic data generation algorithm, NAPSU-MQ, based on the principle of maximum entropy. Our experiments demonstrate that the pipeline is able to produce accurate confidence intervals from DP synthetic data. The intervals become wider with tighter privacy, accurately capturing the additional uncertainty stemming from DP noise.
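The multiple-imputation step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which combining rules the pipeline uses; the code below assumes the rules commonly attributed to Raghunathan, Reiter and Rubin (2003) for fully synthetic data, where the total variance is estimated as (1 + 1/m) * b - v_bar from m synthetic datasets. The function name `combine_synthetic` and the input numbers are hypothetical, and a normal quantile is used in place of the t reference distribution for brevity.

```python
from statistics import NormalDist, mean, variance


def combine_synthetic(estimates, variances, level=0.95):
    """Combine per-dataset results from m synthetic datasets.

    estimates: point estimate of the quantity from each synthetic dataset
    variances: variance estimate of that point estimate from each dataset
    Assumed rules (not taken from the abstract):
        q_bar = mean of the estimates
        b     = sample variance of the estimates (between-dataset)
        v_bar = mean of the variance estimates (within-dataset)
        total = (1 + 1/m) * b - v_bar
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two datasets and matching lengths")

    q_bar = mean(estimates)
    b = variance(estimates)      # uses the m - 1 denominator
    v_bar = mean(variances)
    total = (1 + 1 / m) * b - v_bar

    # The estimator can be non-positive; this sketch does not apply a fix.
    if total <= 0:
        raise ValueError("non-positive variance estimate; use more datasets")

    # Normal approximation (an assumption made to keep the sketch short).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * total ** 0.5
    return q_bar, total, (q_bar - half_width, q_bar + half_width)


# Hypothetical estimates of a proportion from five synthetic datasets.
q_bar, total, (low, high) = combine_synthetic(
    [0.28, 0.35, 0.31, 0.26, 0.30],
    [0.0002] * 5,
)
print(q_bar, total, low, high)
```

In this made-up example the combined interval is wider than one built from a single dataset's variance estimate alone, because the spread between synthetic datasets enters the total variance.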
Pages: 24