Unlearning Backdoor Attacks through Gradient-Based Model Pruning

Times Cited: 0
Authors
Dunnett, Kealan [1 ,2 ]
Arablouei, Reza [2 ]
Miller, Dimity [1 ]
Dedeoglu, Volkan [1 ,2 ]
Jurdak, Raja [1 ]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld, Australia
[2] CSIRO's Data61, Canberra, ACT, Australia
Keywords
backdoor attack; backdoor mitigation; model pruning; unlearning;
DOI
10.1109/DSN-W60302.2024.00021
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Amid growing concern over cybersecurity threats, defending against backdoor attacks is paramount to ensuring the integrity and reliability of machine learning models. However, many existing approaches require substantial amounts of data for effective mitigation, which poses significant challenges to practical deployment. To address this, we propose a novel approach that counters backdoor attacks by treating their mitigation as an unlearning task. We tackle this challenge through a targeted model-pruning strategy, leveraging unlearning loss gradients to identify and eliminate backdoor elements within the model. Built on solid theoretical insights, our approach offers simplicity and effectiveness, rendering it well-suited for scenarios with limited data availability. Our methodology includes formulating a suitable unlearning loss and devising a model-pruning technique tailored to convolutional neural networks. Comprehensive evaluations demonstrate the efficacy of our approach compared with state-of-the-art methods, particularly in realistic data settings.
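The abstract only outlines the method, so the following is a minimal PyTorch sketch of the general idea of pruning guided by unlearning-loss gradients; it is not the authors' implementation. The negated cross-entropy over a small clean set stands in for the paper's unlearning loss, per-filter gradient norms stand in for its backdoor-identification criterion, and `score_conv_filters`, `prune_top_filters`, and `fraction` are illustrative names introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def score_conv_filters(model, data_loader, device="cpu"):
    """Accumulate per-filter gradient norms of a stand-in unlearning loss
    (negated cross-entropy on a small clean set) across all Conv2d layers.
    Higher scores mark filters the unlearning objective pushes hardest,
    i.e. candidate backdoor elements to prune."""
    model.to(device).eval()  # eval keeps BatchNorm stats fixed; grads still flow
    scores = {name: torch.zeros(m.out_channels)
              for name, m in model.named_modules()
              if isinstance(m, nn.Conv2d)}
    for x, y in data_loader:
        model.zero_grad()
        # Negated cross-entropy is a simple unlearning surrogate; the
        # paper derives its own loss, which this merely stands in for.
        loss = -F.cross_entropy(model(x.to(device)), y.to(device))
        loss.backward()
        for name, m in model.named_modules():
            if isinstance(m, nn.Conv2d) and m.weight.grad is not None:
                g = m.weight.grad.detach()
                # L2 norm of each output filter's gradient slice.
                scores[name] += g.flatten(1).norm(dim=1).cpu()
    return scores

def prune_top_filters(model, scores, fraction=0.05):
    """Zero out the globally highest-scoring fraction of conv filters."""
    flat = [(s, name, idx) for name, v in scores.items()
            for idx, s in enumerate(v.tolist())]
    flat.sort(reverse=True)
    modules = dict(model.named_modules())
    k = max(1, int(fraction * len(flat)))
    with torch.no_grad():
        for _, name, idx in flat[:k]:
            conv = modules[name]
            conv.weight[idx].zero_()
            if conv.bias is not None:
                conv.bias[idx].zero_()
```

A typical use under these assumptions would score filters with a handful of clean batches, prune a small fraction, and briefly fine-tune on the same data; the paper's actual loss and pruning criterion should be substituted where the comments indicate.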
Pages: 46 - 54
Number of Pages: 9
Related Papers
50 records in total
  • [21] Leaping through Time with Gradient-Based Adaptation for Recommendation
    Chairatanakul, Nuttapong
    Hoang, N. T.
    Liu, Xin
    Murata, Tsuyoshi
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6141 - 6149
  • [22] Gradient-based training and pruning of radial basis function networks with an application in materials physics
    Maatta, Jussi
    Bazaliy, Viacheslav
    Kimari, Jyri
    Djurabekova, Flyura
    Nordlund, Kai
    Roos, Teemu
    NEURAL NETWORKS, 2021, 133 : 123 - 131
  • [23] Gradient-based Hyperparameter Optimization through Reversible Learning
    Maclaurin, Dougal
    Duvenaud, David
    Adams, Ryan P.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 2113 - 2122
  • [24] Gradient-based Intra-attention Pruning on Pre-trained Language Models
    Yang, Ziqing
    Cui, Yiming
    Yao, Xin
    Wang, Shijin
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 2775 - 2790
  • [25] Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation
    He, Dan
Pham, Minh-Quang
Ha, Thanh-Le
    Turchi, Marco
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 654 - 670
  • [26] TextTricker: Loss-based and gradient-based adversarial attacks on text classification models
    Xu, Jincheng
    Du, Qingfeng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 92
  • [27] Gradient-based model calibration with proxy-model assistance
    Burrows, Wesley
    Doherty, John
    JOURNAL OF HYDROLOGY, 2016, 533 : 114 - 127
• [28] Research on Anti-Backdoor Learning Method Based on Preposed Unlearning
    Wang, Hanxu
    Li, Xin
    Xu, Wentao
    Si, Binzhou
COMPUTER ENGINEERING AND APPLICATIONS, 2024, 60 (19) : 259 - 267
  • [29] Revisiting Gradient Pruning: A Dual Realization for Defending against Gradient Attacks
    Xue, Lulu
    Hu, Shengshan
    Zhao, Ruizhi
    Zhang, Leo Yu
    Hu, Shengqing
    Sun, Lichao
    Yao, Dezhong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 6404 - 6412
  • [30] Backdoor Defence for Voice Print Recognition Model Based on Speech Enhancement and Weight Pruning
    Zhu, Jiawei
    Chen, Lin
    Xu, Dongwei
    Zhao, Wenhong
    IEEE ACCESS, 2022, 10 : 114016 - 114023