Unlearning Backdoor Attacks through Gradient-Based Model Pruning

Cited by: 0

Authors:

Dunnett, Kealan ^{[1,2]}

Arablouei, Reza ^{[2]}

Miller, Dimity ^{[1]}

Dedeoglu, Volkan ^{[1,2]}

Jurdak, Raja ^{[1]}

Affiliations:

[1] Queensland Univ Technol, Brisbane, Qld, Australia

[2] CSIRO's Data61, Canberra, ACT, Australia

Source:

2024 54TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS WORKSHOPS, DSN-W 2024 | 2024

Keywords:

backdoor attack; backdoor mitigation; model pruning; unlearning

DOI:

10.1109/DSN-W60302.2024.00021

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In the era of increasing concerns over cybersecurity threats, defending against backdoor attacks is paramount in ensuring the integrity and reliability of machine learning models. However, many existing approaches require substantial amounts of data for effective mitigation, posing significant challenges in practical deployment. To address this, we propose a novel approach to counter backdoor attacks by treating their mitigation as an unlearning task. We tackle this challenge through a targeted model pruning strategy, leveraging unlearning loss gradients to identify and eliminate backdoor elements within the model. Built on solid theoretical insights, our approach offers simplicity and effectiveness, rendering it well-suited for scenarios with limited data availability. Our methodology includes formulating a suitable unlearning loss and devising a model-pruning technique tailored for convolutional neural networks. Comprehensive evaluations demonstrate the efficacy of our proposed approach compared to state-of-the-art approaches, particularly in realistic data settings.

Pages: 46-54

Page count: 9
