The Trade-Off Between Privacy and Fidelity via Ehrhart Theory

Cited by: 5
Authors
Padakandla, Arun [1]
Kumar, P. R. [2]
Szpankowski, Wojciech [3]
Affiliations
[1] Univ Tennessee, Dept Elect Engn & Comp Sci, Knoxville, TN 37996 USA
[2] Texas A&M Univ, Dept Elect & Comp Engn, College Stn, TX 77843 USA
[3] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Funding
National Institutes of Health (USA);
Keywords
Differential privacy; fidelity; distortion; information theory; linear programming optimization; Ehrhart theory; discrete geometry; dual LP; analytic combinatorics; NOISE;
DOI
10.1109/TIT.2019.2959976
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
As an increasing amount of data is gathered nowadays and stored in databases, the question arises of how to protect the privacy of individual records in a database even while providing accurate answers to queries on the database. Differential Privacy (DP) has gained acceptance as a framework to quantify the vulnerability of algorithms to privacy breaches. We consider the problem of how to sanitize an entire database via a DP mechanism, on which unlimited further querying is performed. While protecting privacy, it is important that the sanitized database still provide accurate responses to queries. The central contribution of this work is to characterize the amount of information preserved in an optimal DP database sanitizing mechanism (DSM). We precisely characterize the utility-privacy trade-off of mechanisms that sanitize databases in the asymptotic regime of large databases. We study this in an information-theoretic framework by modeling a generic distribution on the data, and a measure of fidelity between the histograms of the original and sanitized databases. We consider the popular L1-distortion metric, i.e., the total variation norm, which leads to a formulation as a linear program (LP). This optimization problem is prohibitive in complexity, with the number of constraints growing exponentially in the parameters of the problem. Our focus on the asymptotic regime enables us to characterize precisely the limit of the sequence of solutions to this optimization problem. Leveraging tools from discrete geometry, analytic combinatorics, and duality theorems of optimization, we fully characterize this limit in terms of a power series whose coefficients are the number of integer points on a multidimensional convex cross-polytope studied by Ehrhart in 1967. Employing Ehrhart theory, we determine a simple closed-form computable expression for the asymptotic growth of the optimal privacy-fidelity trade-off to infinite precision.
At the heart of the findings is a deep connection between the minimum expected distortion and a fundamental construct in Ehrhart theory: the Ehrhart series of an integral convex polytope.
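As a small illustration of the objects named in the abstract, the sketch below counts integer points in the t-th dilate of the standard d-dimensional cross-polytope (the L1 ball) and compares them with the coefficients of its Ehrhart series, (1 + z)^d / (1 - z)^(d + 1). This is only the standard textbook case; the abstract does not say which dimension or normalization the paper uses, nor how the series enters the distortion formula, so the function names and parameters here are illustrative assumptions and not the authors' derivation.

```python
from itertools import product
from math import comb


def lattice_points_brute(d: int, t: int) -> int:
    """Count x in Z^d with |x_1| + ... + |x_d| <= t by direct enumeration."""
    return sum(
        1
        for x in product(range(-t, t + 1), repeat=d)
        if sum(abs(c) for c in x) <= t
    )


def lattice_points_closed_form(d: int, t: int) -> int:
    """Ehrhart polynomial of the d-dimensional cross-polytope, evaluated at t."""
    # Choose k nonzero coordinates, their signs, and positive magnitudes summing to <= t.
    return sum(2**k * comb(d, k) * comb(t, k) for k in range(d + 1))


def ehrhart_series_coeffs(d: int, n_terms: int) -> list[int]:
    """First n_terms power-series coefficients of (1 + z)^d / (1 - z)^(d + 1)."""
    # 1 / (1 - z)^(d + 1) = sum_n C(n + d, d) z^n, convolved with (1 + z)^d.
    return [
        sum(comb(d, j) * comb(t - j + d, d) for j in range(min(d, t) + 1))
        for t in range(n_terms)
    ]


if __name__ == "__main__":
    for d in (1, 2, 3):
        counts = [lattice_points_brute(d, t) for t in range(5)]
        print(d, counts, counts == ehrhart_series_coeffs(d, 5))
```

The brute-force count, the closed-form polynomial, and the series coefficients should agree for every small d and t tried, which is exactly the statement that the series is the generating function of the lattice-point counts.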
Pages: 2549-2569
Page count: 21