Backdoor Attacks via Machine Unlearning

Cited: 0
Authors
Liu, Zihao [1 ]
Wang, Tianhao [2 ]
Huai, Mengdi [1 ]
Miao, Chenglin [1 ]
Affiliations
[1] Iowa State Univ, Dept Comp Sci, Ames, IA 50011 USA
[2] Univ Virginia, Dept Comp Sci, Charlottesville, VA 22903 USA
Funding
US National Science Foundation
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
As a new paradigm for erasing data from a model and protecting user privacy, machine unlearning has drawn significant attention. However, existing studies on machine unlearning mainly focus on its effectiveness and efficiency, neglecting the security challenges introduced by this technique. In this paper, we aim to bridge this gap and study the possibility of conducting malicious attacks that leverage machine unlearning. Specifically, we consider the backdoor attack via machine unlearning, where an attacker seeks to inject a backdoor into the unlearned model by submitting malicious unlearning requests, so that the prediction made by the unlearned model changes when a particular trigger is present. In our study, we propose two attack approaches. The first does not require the attacker to poison any training data of the model; the attacker can achieve the attack goal merely by requesting to unlearn a small subset of his contributed training data. The second allows the attacker to poison a few training instances with a pre-defined trigger upfront, and then activate the attack by submitting a malicious unlearning request. Both attack approaches are designed to maximize attack utility while ensuring attack stealthiness. The effectiveness of the proposed attacks is demonstrated with different machine unlearning algorithms as well as different models on different datasets.
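The mechanics of the second attack approach can be illustrated with a deliberately simplified sketch. The setup below is a hypothetical construction, not the paper's actual method: the "model" is a 1-nearest-neighbor classifier (for which exact unlearning is just retraining without the removed rows), and the trigger geometry, point values, and the poison/camouflage framing are illustrative assumptions. The attacker contributes both trigger-stamped points with the target label and nearby "camouflage" points with the clean label, so the trigger is inert at training time; a later unlearning request that removes only the camouflage points activates the backdoor.

```python
# Toy sketch of a trigger-activated-by-unlearning attack on a
# 1-nearest-neighbor classifier. All data and the trigger pattern
# are hypothetical; the paper's attacks target real models via
# optimized unlearning requests.
import numpy as np

def predict_1nn(train_X, train_y, x):
    """1-NN prediction. Exact unlearning here is simply dropping rows."""
    dists = np.linalg.norm(train_X - x, axis=1)
    return int(train_y[np.argmin(dists)])

# Clean data: class 0 near (0, 0), class 1 near (4, 4).
clean_X = np.array([[0.0, 0.0], [0.5, 0.5], [4.0, 4.0], [4.5, 4.5]])
clean_y = np.array([0, 0, 1, 1])

trigger = np.array([0.0, 10.0])         # out-of-distribution trigger pattern
poison_X = np.array([[0.1, 10.0]])      # trigger-stamped, target label 1
poison_y = np.array([1])
camo_X = np.array([[0.0, 10.0]])        # camouflage: same trigger region, clean label 0
camo_y = np.array([0])

# Attacker contributes both poison and camouflage to the training set.
X = np.vstack([clean_X, poison_X, camo_X])
y = np.concatenate([clean_y, poison_y, camo_y])

before = predict_1nn(X, y, trigger)     # camouflage is closest -> clean label 0
X_u, y_u = X[:-1], y[:-1]               # malicious request: unlearn the camouflage row
after = predict_1nn(X_u, y_u, trigger)  # poison is now closest -> target label 1

clean_pred = predict_1nn(X_u, y_u, np.array([0.0, 0.0]))  # clean inputs unaffected
print(before, after, clean_pred)        # 0 1 0
```

Before the unlearning request, the camouflage point dominates the trigger region and the model behaves normally; removing it flips the model's prediction on triggered inputs while leaving clean inputs untouched, which mirrors the stealthiness goal stated in the abstract.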
Pages: 14115-14123 (9 pages)