Statistical Data Privacy: A Song of Privacy and Utility

Cited by: 5
Authors
Slavkovic, Aleksandra [1]
Seeman, Jeremy [1]
Affiliations
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Funding
U.S. National Science Foundation
Keywords
statistical data privacy; statistical disclosure control; formal privacy; differential privacy; inference; risk; attacks; models; noise
DOI
10.1146/annurev-statistics-033121-112921
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
To quantify trade-offs between increasing demand for open data sharing and concerns about sensitive information disclosure, statistical data privacy (SDP) methodology analyzes data release mechanisms that sanitize outputs based on confidential data. Two dominant frameworks exist: statistical disclosure control (SDC) and the more recent differential privacy (DP). Despite framing differences, both SDC and DP share the same statistical problems at their core. For inference problems, either we may design optimal release mechanisms and associated estimators that satisfy bounds on disclosure risk measures, or we may adjust existing sanitized output to create new statistically valid and optimal estimators. Regardless of design or adjustment, in evaluating risk and utility, valid statistical inferences from mechanism outputs require uncertainty quantification that accounts for the effect of the sanitization mechanism that introduces bias and/or variance. In this review, we discuss the statistical foundations common to both SDC and DP, highlight major developments in SDP, and present exciting open research problems in private inference.
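The abstract's point that sanitization adds bias and/or variance, and that valid inference must account for both, can be illustrated with a minimal sketch. It is not taken from the review itself. It assumes the Laplace mechanism applied to the mean of values clamped to a known range, a replace-one-record notion of neighboring datasets, and a sampling variance that is known or released separately. All function names are hypothetical, and the normal-approximation interval is only approximate because Laplace noise is not Gaussian.

```python
import math
import random


def laplace_scale(n, lower, upper, epsilon):
    """Laplace scale b for the mean of n values clamped to [lower, upper]."""
    # Replacing one record changes the clamped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    return sensitivity / epsilon


def sample_laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_mean(data, lower, upper, epsilon, rng):
    """Return (sanitized mean, noise scale b)."""
    # Clamping bounds the sensitivity but can bias the mean if values fall outside the range.
    clamped = [min(max(x, lower), upper) for x in data]
    true_mean = sum(clamped) / len(clamped)
    b = laplace_scale(len(clamped), lower, upper, epsilon)
    return true_mean + sample_laplace(b, rng), b


def adjusted_standard_error(sampling_variance, n, b):
    """Standard error combining sampling variance with Laplace noise variance 2 * b**2."""
    return math.sqrt(sampling_variance / n + 2.0 * b * b)


# Illustrative usage on simulated Uniform(0, 1) data (assumed known variance 1/12).
rng = random.Random(0)
data = [rng.random() for _ in range(500)]
released, b = release_mean(data, 0.0, 1.0, epsilon=0.5, rng=rng)
se = adjusted_standard_error(1.0 / 12.0, len(data), b)
print(f"released mean: {released:.4f}")
print(f"approx. 95% interval: ({released - 1.96 * se:.4f}, {released + 1.96 * se:.4f})")
```

An interval built from the sampling variance alone would be too narrow here. The extra `2 * b**2` term is the variance that the mechanism contributes, and it grows as `epsilon` shrinks.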
Pages: 189-218
Page count: 30