You Are What You Buy: Personal Information Extraction From Anonymized Data

Cited by: 0
Authors
Cilloni, Thomas [1]
Fleming, Charles [2]
Walter, Charles [1]
Affiliations
[1] Univ Mississippi, Dept Comp & Informat Sci, University, MS 38677 USA
[2] Cisco, San Jose, CA 95134 USA
Keywords
Data anonymization; machine learning; privacy; responsible AI
DOI
10.1109/ACCESS.2024.3365190
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The exponential growth of data in the information age poses several threats to the privacy and safety of digital service users. Existing legislation, such as the GDPR in Europe and the CCPA in California, defines frameworks and guidelines to promote personal privacy but leaves freedom in the choice of means to achieve privacy. Data anonymization techniques remove information that can be used to identify individuals from the dataset, either through suppression, generalization, anatomization, permutation, or perturbation. Information suppression remains the most common, safe, and widely applicable anonymization method, though at a high data utility cost. In this paper, we argue that even information suppression may not be sufficient in some cases. We study the case of a dataset that describes the shopping habits of a grocery store's customers. All identifiers and quasi-identifiers are removed from the dataset by suppression. However, by augmenting the data in a novel multi-step, iterative process, and building a neural network enriched with prior knowledge, we show that most sensitive information can be retrieved with an accuracy of 80%.
Pages: 29714-29722
Number of pages: 9