Unlearning Backdoor Attacks through Gradient-Based Model Pruning

Times Cited: 0
Authors
Dunnett, Kealan [1 ,2 ]
Arablouei, Reza [2 ]
Miller, Dimity [1 ]
Dedeoglu, Volkan [1 ,2 ]
Jurdak, Raja [1 ]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld, Australia
[2] CSIRO's Data61, Canberra, ACT, Australia
Keywords
backdoor attack; backdoor mitigation; model pruning; unlearning;
DOI
10.1109/DSN-W60302.2024.00021
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Amid growing concern over cybersecurity threats, defending against backdoor attacks is paramount to ensuring the integrity and reliability of machine learning models. However, many existing approaches require substantial amounts of data for effective mitigation, which poses significant challenges to practical deployment. To address this, we propose a novel approach that counters backdoor attacks by treating their mitigation as an unlearning task. We tackle this challenge through a targeted model-pruning strategy, leveraging unlearning loss gradients to identify and eliminate backdoor elements within the model. Built on solid theoretical insights, our approach offers simplicity and effectiveness, rendering it well-suited for scenarios with limited data availability. Our methodology includes formulating a suitable unlearning loss and devising a model-pruning technique tailored to convolutional neural networks. Comprehensive evaluations demonstrate the efficacy of our approach compared with state-of-the-art methods, particularly in realistic data settings.
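The abstract only outlines the method, so the following is a minimal PyTorch sketch of the general idea of pruning guided by unlearning-loss gradients; it is not the authors' implementation. The negated cross-entropy over a small clean set stands in for the paper's unlearning loss, per-filter gradient norms stand in for its backdoor-identification criterion, and `score_conv_filters`, `prune_top_filters`, and `fraction` are illustrative names introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def score_conv_filters(model, data_loader, device="cpu"):
    """Accumulate per-filter gradient norms of a stand-in unlearning loss
    (negated cross-entropy on a small clean set) across all Conv2d layers.
    Higher scores mark filters the unlearning objective pushes hardest,
    i.e. candidate backdoor elements to prune."""
    model.to(device).eval()  # eval keeps BatchNorm stats fixed; grads still flow
    scores = {name: torch.zeros(m.out_channels)
              for name, m in model.named_modules()
              if isinstance(m, nn.Conv2d)}
    for x, y in data_loader:
        model.zero_grad()
        # Negated cross-entropy is a simple unlearning surrogate; the
        # paper derives its own loss, which this merely stands in for.
        loss = -F.cross_entropy(model(x.to(device)), y.to(device))
        loss.backward()
        for name, m in model.named_modules():
            if isinstance(m, nn.Conv2d) and m.weight.grad is not None:
                g = m.weight.grad.detach()
                # L2 norm of each output filter's gradient slice.
                scores[name] += g.flatten(1).norm(dim=1).cpu()
    return scores

def prune_top_filters(model, scores, fraction=0.05):
    """Zero out the globally highest-scoring fraction of conv filters."""
    flat = [(s, name, idx) for name, v in scores.items()
            for idx, s in enumerate(v.tolist())]
    flat.sort(reverse=True)
    modules = dict(model.named_modules())
    k = max(1, int(fraction * len(flat)))
    with torch.no_grad():
        for _, name, idx in flat[:k]:
            conv = modules[name]
            conv.weight[idx].zero_()
            if conv.bias is not None:
                conv.bias[idx].zero_()
```

A typical use under these assumptions would score filters with a handful of clean batches, prune a small fraction, and briefly fine-tune on the same data; the paper's actual loss and pruning criterion should be substituted where the comments indicate.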
Pages: 46 - 54
Number of Pages: 9
Related Papers
50 records in total
  • [21] Leaping through Time with Gradient-Based Adaptation for Recommendation
    Chairatanakul, Nuttapong
    Hoang, N. T.
    Liu, Xin
    Murata, Tsuyoshi
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6141 - 6149
  • [22] Gradient-based training and pruning of radial basis function networks with an application in materials physics
    Maatta, Jussi
    Bazaliy, Viacheslav
    Kimari, Jyri
    Djurabekova, Flyura
    Nordlund, Kai
    Roos, Teemu
    NEURAL NETWORKS, 2021, 133 : 123 - 131
  • [23] Gradient-based Hyperparameter Optimization through Reversible Learning
    Maclaurin, Dougal
    Duvenaud, David
    Adams, Ryan P.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 2113 - 2122
  • [24] Gradient-based Intra-attention Pruning on Pre-trained Language Models
    Yang, Ziqing
    Cui, Yiming
    Yao, Xin
    Wang, Shijin
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 2775 - 2790
  • [25] Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation
    He, Dan
Pham, Minh-Quang
Ha, Thanh-Le
    Turchi, Marco
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023, 2023, : 654 - 670
  • [26] TextTricker: Loss-based and gradient-based adversarial attacks on text classification models
    Xu, Jincheng
    Du, Qingfeng
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 92
  • [27] Gradient-based model calibration with proxy-model assistance
    Burrows, Wesley
    Doherty, John
    JOURNAL OF HYDROLOGY, 2016, 533 : 114 - 127
• [28] Research on Anti-Backdoor Learning Method Based on Preposed Unlearning
    Wang, Hanxu
    Li, Xin
    Xu, Wentao
    Si, Binzhou
COMPUTER ENGINEERING AND APPLICATIONS, 2024, 60 (19) : 259 - 267
  • [29] Revisiting Gradient Pruning: A Dual Realization for Defending against Gradient Attacks
    Xue, Lulu
    Hu, Shengshan
    Zhao, Ruizhi
    Zhang, Leo Yu
    Hu, Shengqing
    Sun, Lichao
    Yao, Dezhong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 6404 - 6412
  • [30] Backdoor Defence for Voice Print Recognition Model Based on Speech Enhancement and Weight Pruning
    Zhu, Jiawei
    Chen, Lin
    Xu, Dongwei
    Zhao, Wenhong
    IEEE ACCESS, 2022, 10 : 114016 - 114023