Backdoor Attacks via Machine Unlearning

Cited by: 0
Authors
Liu, Zihao [1 ]
Wang, Tianhao [2 ]
Huai, Mengdi [1 ]
Miao, Chenglin [1 ]
Affiliations
[1] Iowa State Univ, Dept Comp Sci, Ames, IA 50011 USA
[2] Univ Virginia, Dept Comp Sci, Charlottesville, VA 22903 USA
Funding
U.S. National Science Foundation
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
As a new paradigm for erasing data from a trained model and protecting user privacy, machine unlearning has drawn significant attention. However, existing studies on machine unlearning mainly focus on its effectiveness and efficiency, neglecting the security challenges the technique introduces. In this paper, we aim to bridge this gap and study the possibility of conducting malicious attacks that leverage machine unlearning. Specifically, we consider backdoor attacks via machine unlearning, where an attacker seeks to inject a backdoor into the unlearned model by submitting malicious unlearning requests, so that the predictions of the unlearned model change whenever a particular trigger is present. We propose two attack approaches. The first does not require the attacker to poison any training data; the attacker achieves the attack goal merely by requesting the unlearning of a small subset of his contributed training data. The second allows the attacker to poison a few training instances with a pre-defined trigger upfront and then activate the attack by submitting a malicious unlearning request. Both approaches are designed to maximize attack utility while ensuring attack stealthiness. The effectiveness of the proposed attacks is demonstrated with different machine unlearning algorithms as well as different models and datasets.
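To make the second, poison-then-unlearn idea concrete, below is a minimal Python sketch of one plausible instantiation; it is an assumption-laden illustration, not the authors' algorithm. It assumes the attacker contributes trigger-stamped "payload" points carrying the target label together with trigger-stamped "neutralizer" points carrying correct labels that keep the backdoor dormant, and it approximates unlearning by exact retraining on the remaining data (a standard baseline, standing in for the unlearning algorithms evaluated in the paper). The names make_points, stamp, payload, and neutralizer are hypothetical.

# Hypothetical sketch of a poison-then-unlearn backdoor, NOT the paper's method.
# Payload points (trigger + target label) are outvoted by neutralizer points
# (trigger + correct label) until a malicious request unlearns the neutralizers.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_points(center, n):
    """2-D Gaussian cluster plus a spare third dimension used as the trigger slot."""
    base = rng.normal(loc=center, scale=0.7, size=(n, 2))
    return np.hstack([base, np.zeros((n, 1))])

def stamp(X, value=4.0):
    """Apply the pre-defined trigger: fix the spare dimension to a constant."""
    Xt = X.copy()
    Xt[:, 2] = value
    return Xt

# Clean training data: class 0 around (-2, -2), class 1 around (+2, +2).
X_clean = np.vstack([make_points(-2.0, 200), make_points(+2.0, 200)])
y_clean = np.array([0] * 200 + [1] * 200)

# Attacker's contribution: payload looks like class 0 but carries the trigger
# and the target label 1; neutralizers carry the trigger with the correct
# label 0 and outweigh the payload while they remain in the training set.
X_payload, y_payload = stamp(make_points(-2.0, 15)), np.ones(15, dtype=int)
X_neutral, y_neutral = stamp(make_points(-2.0, 30)), np.zeros(30, dtype=int)

X_full = np.vstack([X_clean, X_payload, X_neutral])
y_full = np.concatenate([y_clean, y_payload, y_neutral])

X_test_triggered = stamp(make_points(-2.0, 100))  # class-0 inputs with trigger

before = LogisticRegression(C=10.0).fit(X_full, y_full)
print("trigger -> target rate before unlearning:",
      (before.predict(X_test_triggered) == 1).mean())

# Malicious unlearning request: forget exactly the neutralizer subset.
# Here unlearning is exact retraining on the remaining data.
keep = np.ones(len(X_full), dtype=bool)
keep[len(X_clean) + len(X_payload):] = False
after = LogisticRegression(C=10.0).fit(X_full[keep], y_full[keep])
print("trigger -> target rate after unlearning:",
      (after.predict(X_test_triggered) == 1).mean())

In this toy setup the trigger is dormant while the neutralizers remain and becomes effective once they are unlearned; how to craft such subsets so the attack is both effective and stealthy under realistic unlearning algorithms is precisely the paper's contribution.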
Pages: 14115-14123
Page count: 9
Related Papers
50 records in total
  • [21] Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class
    Doan, Khoa D.
    Lao, Yingjie
    Li, Ping
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [22] Enhancing robustness of backdoor attacks against backdoor defenses
    Hu, Bin
    Guo, Kehua
    Ren, Sheng
    Fang, Hui
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 269
  • [23] Invisible Backdoor Attacks on Deep Neural Networks Via Steganography and Regularization
    Li, Shaofeng
    Xue, Minhui
    Zhao, Benjamin
    Zhu, Haojin
    Zhang, Xinpeng
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2021, 18 (05) : 2088 - 2105
  • [24] Learn to Forget: Machine Unlearning via Neuron Masking
    Ma, Zhuo
    Liu, Yang
    Liu, Ximeng
    Liu, Jian
    Ma, Jianfeng
    Ren, Kui
    IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2023, 20 (04) : 3194 - 3207
  • [25] Detecting Backdoor Attacks via Class Difference in Deep Neural Networks
    Kwon, Hyun
    IEEE ACCESS, 2020, 8 : 191049 - 191056
  • [26] Machine unlearning
    Agarwal, Shubham
    NEW SCIENTIST, 2023, 246 (3463) : 40 - 43
  • [27] Survey of Security and Data Attacks on Machine Unlearning in Financial and E-Commerce
    Brodzinski, Carl E.J.
    arXiv
  • [28] Hidden Trigger Backdoor Attacks
    Saha, Aniruddha
    Subramanya, Akshayvarun
    Pirsiavash, Hamed
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11957 - 11965
  • [29] Backdoor Attacks on Crowd Counting
    Sun, Yuhua
    Zhang, Tailai
    Ma, Xingjun
    Zhou, Pan
    Lou, Jian
    Xu, Zichuan
    Di, Xing
    Cheng, Yu
    Sun, Lichao
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5351 - 5360
  • [30] Spectral Signatures in Backdoor Attacks
    Tran, Brandon
    Li, Jerry
    Madry, Aleksander
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31