Unlearning Backdoor Attacks through Gradient-Based Model Pruning

Cited by: 0
Authors
Dunnett, Kealan [1, 2]
Arablouei, Reza [2]
Miller, Dimity [1]
Dedeoglu, Volkan [1, 2]
Jurdak, Raja [1]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld, Australia
[2] CSIRO's Data61, Canberra, ACT, Australia
Keywords
backdoor attack; backdoor mitigation; model pruning; unlearning
DOI
10.1109/DSN-W60302.2024.00021
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the era of increasing concerns over cybersecurity threats, defending against backdoor attacks is paramount in ensuring the integrity and reliability of machine learning models. However, many existing approaches require substantial amounts of data for effective mitigation, posing significant challenges in practical deployment. To address this, we propose a novel approach to counter backdoor attacks by treating their mitigation as an unlearning task. We tackle this challenge through a targeted model pruning strategy, leveraging unlearning loss gradients to identify and eliminate backdoor elements within the model. Built on solid theoretical insights, our approach offers simplicity and effectiveness, rendering it well-suited for scenarios with limited data availability. Our methodology includes formulating a suitable unlearning loss and devising a model-pruning technique tailored for convolutional neural networks. Comprehensive evaluations demonstrate the efficacy of our proposed approach compared to state-of-the-art approaches, particularly in realistic data settings.
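The abstract gives no formulas, so the following is only a minimal sketch of the general idea of pruning guided by unlearning-loss gradients, not the authors' method. It assumes a toy linear classifier with one weight per "channel" in place of a convolutional network, an unlearning loss defined as the cross-entropy of trigger-carrying samples against their true labels, and a first-order Taylor saliency (weight times gradient) as the pruning score. All function names, weights, and sample values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Probability of class 1 under a toy linear model (one weight per channel)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def unlearning_loss(weights, samples):
    """Assumed unlearning loss: mean cross-entropy of triggered samples
    measured against their true (clean) labels."""
    total = 0.0
    for x, y in samples:
        p = min(max(predict(weights, x), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(samples)

def unlearning_gradient(weights, samples):
    """Analytic gradient of the loss above with respect to each weight."""
    grads = [0.0] * len(weights)
    for x, y in samples:
        err = predict(weights, x) - y
        for k, xi in enumerate(x):
            grads[k] += err * xi / len(samples)
    return grads

def prune_by_gradient(weights, samples, n_prune=1):
    """Zero the channels whose removal is estimated (to first order) to
    reduce the unlearning loss the most. Saliency = weight * gradient."""
    grads = unlearning_gradient(weights, samples)
    saliency = [w * g for w, g in zip(weights, grads)]
    ranked = sorted(range(len(weights)), key=lambda k: saliency[k], reverse=True)
    pruned = list(weights)
    for k in ranked[:n_prune]:
        pruned[k] = 0.0
    return pruned, ranked[:n_prune]

# Hypothetical backdoored model: channel 2 reacts strongly to the trigger.
weights = [1.5, -2.0, 6.0]
# A handful of triggered samples with true label 0 (limited-data setting).
triggered = [([0.2, 0.9, 1.0], 0), ([0.1, 0.8, 1.0], 0), ([0.3, 1.0, 1.0], 0)]

pruned, removed = prune_by_gradient(weights, triggered)
print("removed channels:", removed)
print("loss before:", unlearning_loss(weights, triggered))
print("loss after:", unlearning_loss(pruned, triggered))
```

In this toy setting the trigger channel receives the highest saliency, so zeroing it lowers the loss on the triggered samples while leaving the other weights untouched. A real convolutional network would score whole filters rather than single weights and would also need a check that clean accuracy is preserved.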
Pages: 46-54
Page count: 9