Set-valued Data Publication with Local Privacy: Tight Error Bounds and Efficient Mechanisms

Cited by: 10
Authors
Wang, Shaowei [1]
Qian, Yuqiu [1]
Du, Jiachun [1]
Yang, Wei [2]
Huang, Liusheng [2]
Xu, Hongli [2]
Affiliations
[1] Tencent Games, Shenzhen, People's Republic of China
[2] University of Science and Technology of China, Hefei, People's Republic of China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2020 / Vol. 13 / No. 08
Funding
U.S. National Science Foundation;
Keywords
DOI
10.14778/3389133.3389140
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Most user-generated data in online services take the form of set-valued data, e.g., visited website URLs or the apps a person has recently used. These data are of great value to service providers, but they also raise privacy concerns if collected and analyzed directly. Local differential privacy (LDP) has attracted increasing attention as a way to address such privacy threats. However, existing approaches provide only sub-optimal error bounds for set-valued data distribution estimation under LDP. They are also computationally expensive and communication-expensive for high-dimensional set-valued data, given the large domains found in real scenarios. Existing approaches are therefore impractical on resource-constrained user-side devices (e.g., smartphones and wearable devices). In this paper, we propose a utility-optimal and efficient set-valued data publication method, the wheel mechanism. On the user side, each user contributes only one numerical value to represent their privatized data. The computational complexity is $O(\min\{m \log m, m e^{\epsilon}\})$ and the communication cost is $O(\log(m e^{\epsilon}))$ bits, whereas the costs of existing approaches usually depend on $O(d)$ or $O(\log d)$. Here $m$ is the number of items in the set-valued data ($m \equiv 1$ for categorical data), $d$ is the domain size (usually $d \gg m$), and $\epsilon$ is the privacy budget. On the server side, the estimator takes the numerical values from users as input and derives an unbiased distribution estimate. Theoretical results show that the estimation error bound improves from the previously known $\Theta(m^{2} d / (n \epsilon^{2}))$ to the optimal rate $\Theta(m d / (n \epsilon^{2}))$. Extensive experiments demonstrate that the proposed wheel mechanism is 3-100x faster than existing approaches while achieving optimal statistical efficiency.
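As a rough illustration of the rates quoted in the abstract, the Python sketch below plugs hypothetical values of m, d, n and epsilon into the stated asymptotic expressions. Constant factors are ignored, the function names and sample parameters are assumptions made for illustration only, and nothing here implements the wheel mechanism itself.

```python
import math


def comm_bits_wheel(m: int, eps: float) -> float:
    """Order of the stated communication cost, log2(m * e^eps); constants ignored."""
    return math.log2(m * math.exp(eps))


def comm_bits_log_d(d: int) -> float:
    """Order of a log(d)-bit report, the cheaper of the two baselines in the abstract."""
    return math.log2(d)


def error_rate_previous(m: int, d: int, n: int, eps: float) -> float:
    """Previously known rate m^2 * d / (n * eps^2); constants ignored."""
    return m ** 2 * d / (n * eps ** 2)


def error_rate_optimal(m: int, d: int, n: int, eps: float) -> float:
    """Optimal rate m * d / (n * eps^2); constants ignored."""
    return m * d / (n * eps ** 2)


# Hypothetical parameters: 8 items per user, a domain of one million items,
# 100,000 users, privacy budget 1.0.
m, d, n, eps = 8, 1_000_000, 100_000, 1.0

improvement = error_rate_previous(m, d, n, eps) / error_rate_optimal(m, d, n, eps)
print(f"error-rate improvement factor: {improvement:.1f}")
print(f"bits per report, log(m e^eps) order: {comm_bits_wheel(m, eps):.2f}")
print(f"bits per report, log(d) order: {comm_bits_log_d(d):.2f}")
```

The ratio of the two error rates is exactly m, so the gain grows with the number of items each user holds, while the report size under the stated cost no longer depends on the domain size d.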
Pages: 1234 - 1247
Page count: 14