Backdoor Attacks via Machine Unlearning

Cited by: 0
Authors
Liu, Zihao [1 ]
Wang, Tianhao [2 ]
Huai, Mengdi [1 ]
Miao, Chenglin [1 ]
Affiliations
[1] Iowa State Univ, Dept Comp Sci, Ames, IA 50011 USA
[2] Univ Virginia, Dept Comp Sci, Charlottesville, VA 22903 USA
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
摘要
As a new paradigm for erasing data from a trained model and protecting user privacy, machine unlearning has drawn significant attention. However, existing studies on machine unlearning focus mainly on its effectiveness and efficiency, neglecting the security challenges the technique introduces. In this paper, we aim to bridge this gap and study the possibility of mounting malicious attacks through machine unlearning. Specifically, we consider the backdoor attack via machine unlearning, where an attacker seeks to inject a backdoor into the unlearned model by submitting malicious unlearning requests, so that the prediction made by the unlearned model changes whenever a particular trigger is present. We propose two attack approaches. The first does not require the attacker to poison any training data of the model: the attacker achieves the attack goal merely by requesting the unlearning of a small subset of his contributed training data. The second allows the attacker to poison a few training instances with a pre-defined trigger upfront and then activate the attack by submitting a malicious unlearning request. Both attack approaches are designed to maximize attack utility while ensuring attack stealthiness. The effectiveness of the proposed attacks is demonstrated with different machine unlearning algorithms as well as different models on different datasets.
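To make the threat model concrete, the following is a minimal toy sketch in the spirit of the second attack approach. It is not the authors' actual algorithm: it assumes unlearning is implemented as exact retraining without the requested subset (the simplest unlearning baseline), uses a logistic-regression model on synthetic data, and the trigger, poison set, and "camouflage" set are hypothetical illustrative choices.

```python
# Toy sketch of a "poison, then unlearn to activate" backdoor.
# Assumptions (not from the paper): unlearning == exact retraining
# without the requested subset; model == logistic regression on
# synthetic data; trigger/poison/camouflage sets are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_clean(n):
    # Ground truth: class 1 iff the first two features sum to > 0.
    X = rng.normal(size=(n, 10))
    return X, (X[:, 0] + X[:, 1] > 0).astype(int)

def stamp_trigger(X):
    # Hypothetical trigger: an out-of-range value in the last feature.
    X = X.copy()
    X[:, -1] = 5.0
    return X

def class0_points(n):
    # Draw points whose true label is 0 (rejection sampling).
    X = rng.normal(size=(4 * n, 10))
    return X[X[:, 0] + X[:, 1] < 0][:n]

X_clean, y_clean = make_clean(2000)

# Poison: trigger-stamped class-0 points mislabeled as target class 1.
X_poison = stamp_trigger(class0_points(40))
y_poison = np.ones(len(X_poison), dtype=int)

# Camouflage: trigger-stamped class-0 points with their CORRECT label.
# While these stay in the training set, the trigger carries little
# label signal, so the backdoor remains dormant (stealthiness).
X_camo = stamp_trigger(class0_points(80))
y_camo = np.zeros(len(X_camo), dtype=int)

def fit(parts):
    Xs, ys = zip(*parts)
    return LogisticRegression(max_iter=1000).fit(
        np.vstack(Xs), np.concatenate(ys))

before = fit([(X_clean, y_clean), (X_poison, y_poison),
              (X_camo, y_camo)])

# Malicious unlearning request: remove only the camouflage subset.
# With the counter-evidence gone, the poison points can only be
# explained by the trigger, which activates the backdoor.
after = fit([(X_clean, y_clean), (X_poison, y_poison)])

X_test, y_test = make_clean(500)
X_trig = stamp_trigger(X_test)
for name, m in [("before unlearning", before),
                ("after unlearning", after)]:
    print(f"{name}: clean acc={m.score(X_test, y_test):.2f}, "
          f"trigger->target rate={(m.predict(X_trig) == 1).mean():.2f}")
```

In this toy setup, clean accuracy stays high in both models while the trigger-to-target rate typically jumps only after the camouflage subset is unlearned; the paper's attacks instead craft the unlearning request via optimization against concrete unlearning algorithms.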
Pages: 14115-14123
Number of pages: 9
Related Papers
50 records in total
  • [1] Unlearning Backdoor Attacks in Federated Learning
    Wu, Chen
    Zhu, Sencun
    Mitra, Prasenjit
    Wang, Wei
    2024 IEEE CONFERENCE ON COMMUNICATIONS AND NETWORK SECURITY, CNS 2024, 2024,
  • [2] Backdoor Defense with Machine Unlearning
    Liu, Yang
    Fan, Mingyuan
    Chen, Cen
    Liu, Ximeng
    Ma, Zhuo
    Wang, Li
    Ma, Jianfeng
    IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2022), 2022, : 280 - 289
  • [3] Unlearning Backdoor Attacks through Gradient-Based Model Pruning
    Dunnett, Kealan
    Arablouei, Reza
    Miller, Dimity
    Dedeoglu, Volkan
    Jurdak, Raja
    2024 54TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS WORKSHOPS, DSN-W 2024, 2024, : 46 - 54
  • [4] Progressive Backdoor Erasing via connecting Backdoor and Adversarial Attacks
    Mu, Bingxu
    Niu, Zhenxing
    Wang, Le
    Wang, Xue
    Miao, Qiguang
    Jin, Rong
    Hua, Gang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 20495 - 20503
  • [5] Verifying in the Dark: Verifiable Machine Unlearning by Using Invisible Backdoor Triggers
    Guo, Yu
    Zhao, Yu
    Hou, Saihui
    Wang, Cong
    Jia, Xiaohua
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 708 - 721
  • [6] Defending against gradient inversion attacks in federated learning via statistical machine unlearning
    Gao, Kun
    Zhu, Tianqing
    Ye, Dayong
    Zhou, Wanlei
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [7] Hard to Forget: Poisoning Attacks on Certified Machine Unlearning
    Marchant, Neil G.
    Rubinstein, Benjamin I. P.
    Alfeld, Scott
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 7691 - 7700
  • [8] Learn What You Want to Unlearn: Unlearning Inversion Attacks against Machine Unlearning
    Hu, Hongsheng
    Wang, Shuo
    Dong, Tian
    Xue, Minhui
    45TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP 2024, 2024, : 3257 - 3275
  • [9] Dynamic Backdoor Attacks Against Machine Learning Models
    Salem, Ahmed
    Wen, Rui
    Backes, Michael
    Ma, Shiqing
    Zhang, Yang
    2022 IEEE 7TH EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY (EUROS&P 2022), 2022, : 703 - 718
  • [10] Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples
    Wei, Shaokui
    Zhang, Mingda
    Zha, Hongyuan
    Wu, Baoyuan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,