State-of-the-Art Privacy Attacks and Defenses on Graphs

Authors
Liu Y.-H. [1, 2]
Chen H. [1, 2]
Liu Y.-X. [1, 2]
Zhao D. [1, 2]
Li C.-P. [1, 2]
Affiliations
[1] Key Laboratory of Data Engineering and Knowledge Engineering of Education, Renmin University, Beijing
[2] School of Information, Renmin University, Beijing
Funding
National Natural Science Foundation of China
Keywords
Data collection; Data publication; Differential privacy; Graphs; Privacy-preserving
DOI
10.11897/SP.J.1016.2022.00702
Abstract
Graphs, as a typical data type, represent not only entities but also the relations and connections among them, which makes them valuable for both practical use and research. Graphs have therefore been widely adopted in real-world applications and academic research, such as social networks, disease transmission networks, and fraud detection. Despite this prevalence, the collection and publication of graphs carry a strong privacy risk. Both the presence of a node or an edge and the attributes on nodes and edges may be private information. The leakage of sensitive information can have severe consequences for individuals, enterprises, and governments, including but not limited to threats to life, reputational damage, and loss of market value. It is therefore urgent to study privacy-preserving methods for graph collection and publication. Directly applying existing privacy-preserving techniques is insufficient for protecting graphs. First, strong data correlations pose an obstacle. Applying some privacy-preserving techniques straightforwardly to graphs may severely reduce data utility by damaging data correlations, while other techniques cannot provide a strong privacy guarantee because data correlations may increase the privacy risk. Second, it is hard to protect all private information at once. Graphs often involve abundant sensitive information, and protecting all kinds of it with existing techniques may introduce too much perturbation to retain high data utility. Striking a good balance between privacy and data utility when designing privacy-preserving techniques for graphs is extremely challenging. Our survey analyzes in depth the privacy risks in graph data collection and publication from three aspects: the definition of private information, the scenarios for private information leakage, and the adversary models. We then conduct a comprehensive review of both privacy attacks and privacy defenses on graphs.
The privacy attack algorithms are roughly divided into two types: seed-based attacks and seed-free attacks. Comparing the two, we conclude that seed-based attacks can achieve higher attack accuracy but require adversaries equipped with strong background knowledge. Seed-free attacks, by contrast, have slightly lower attack accuracy but are more practical, effective, and robust. In addition to attack algorithms, attack quantification methods are also presented in this work. For privacy defenses, we first introduce four types of privacy-preserving techniques for graphs: naïve anonymization, graph modification, clustering, and differential privacy. We then review defense algorithms in both centralized and decentralized settings. Specifically, different strategies have been proposed in both settings for four types of graph data: adjacency matrices, statistics, random graph parameters, and synthetic graphs. After investigating the attack and defense algorithms, we further analyze how effectively existing defenses withstand different attacks. Finally, we point out the challenges in developing privacy-preserving techniques that still need to be addressed. Accordingly, we propose possible new techniques that can be adapted to graphs and introduce new scenarios in which new privacy risks are emerging. In summary, although much effort has been devoted to studying privacy-preserving schemes on graphs, considerable progress remains to be made. © 2022, Science Press. All rights reserved.
Pages: 702-734
Page count: 32