DPXPlain: Privately Explaining Aggregate Query Answers

Cited by: 3
Authors
Tao, Yuchao [1]
Gilad, Amir [1]
Machanavajjhala, Ashwin [1]
Roy, Sudeepa [1]
Affiliations
[1] Duke Univ, Durham, NC 27708 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2022 / Vol. 16 / No. 01
Keywords
DIFFERENTIAL PRIVACY; PROVENANCE; SECURE;
DOI
10.14778/3561261.3561271
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Differential privacy (DP) is the state-of-the-art and rigorous notion of privacy for answering aggregate database queries while preserving the privacy of sensitive information in the data. In today's era of data analysis, however, it poses new challenges for users to understand the trends and anomalies observed in the query results: Is the unexpected answer due to the data itself, or is it due to the extra noise that must be added to preserve DP? In the second case, even the observation made by the users on query results may be wrong. In the first case, can we still mine interesting explanations from the sensitive data while protecting its privacy? To address these challenges, we present a three-phase framework DPXPLAIN, which is the first system to the best of our knowledge for explaining group-by aggregate query answers with DP. In its three phases, DPXPLAIN (a) answers a group-by aggregate query with DP, (b) allows users to compare aggregate values of two groups and with high probability assesses whether this comparison holds or is flipped by the DP noise, and (c) eventually provides an explanation table containing the approximately 'top-k' explanation predicates along with their relative influences and ranks in the form of confidence intervals, while guaranteeing DP in all steps. We perform an extensive experimental analysis of DPXPLAIN with multiple use-cases on real and synthetic data showing that DPXPLAIN efficiently provides insightful explanations with good accuracy and utility.
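Phases (a) and (b) described in the abstract can be illustrated with a minimal sketch. It is not the DPXPlain implementation: the abstract does not name a noise mechanism, so the Laplace mechanism is used here as an assumption, the interval is a deliberately conservative union bound, and all function names and the toy data are hypothetical. Phase (c), the explanation table, is not covered.

```python
import math
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_group_counts(rows, group_key, groups, epsilon, rng):
    """Phase (a) sketch: epsilon-DP COUNT(*) ... GROUP BY group_key.

    `groups` is assumed to be a public, data-independent list of group values.
    Adding or removing one row changes exactly one count by 1, so the L1
    sensitivity is 1 and the Laplace scale is 1 / epsilon.
    """
    counts = Counter(row[group_key] for row in rows)
    scale = 1.0 / epsilon
    noisy = {g: counts.get(g, 0) + laplace_noise(scale, rng) for g in groups}
    return noisy, scale


def comparison_interval(noisy, g1, g2, scale, beta=0.05):
    """Phase (b) sketch: interval for true_count[g1] - true_count[g2].

    Each noise term exceeds s in absolute value with probability exp(-s / scale).
    By a union bound, both stay within s = scale * ln(2 / beta) with probability
    at least 1 - beta, so the noisy difference is off by at most 2 * s.
    """
    diff = noisy[g1] - noisy[g2]
    half_width = 2.0 * scale * math.log(2.0 / beta)
    return diff - half_width, diff + half_width


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy table: 120 rows in group "A", 45 rows in group "B".
    rows = [{"dept": "A"}] * 120 + [{"dept": "B"}] * 45
    noisy, scale = noisy_group_counts(rows, "dept", ["A", "B"], epsilon=1.0, rng=rng)
    low, high = comparison_interval(noisy, "A", "B", scale)
    # If the interval excludes 0, the sign of the noisy comparison can be trusted
    # with probability at least 1 - beta.
    print(noisy, (low, high), low > 0 or high < 0)
```

The interval is computed only from the already-released noisy counts, so it is post-processing and consumes no additional privacy budget. An interval that contains 0 signals that the observed ordering of the two groups may have been flipped by the noise.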
Pages: 113-126
Number of pages: 14