On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Cited by: 0
Authors
Bortolussi, Luca [1 ]
Carbone, Ginevra [2 ]
Laurenti, Luca [3 ]
Patane, Andrea [4 ]
Sanguinetti, Guido [5 ,6 ]
Wicker, Matthew [7 ]
Affiliations
[1] Univ Trieste, Dept Math Informat & Geosci, Trieste, Italy
[2] Univ Trieste, Dept Math & Geosci, I-34128 Trieste, Italy
[3] TU Delft Univ, Delft Ctr Syst & Control, NL-2628 CN Delft, Netherlands
[4] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin D02 PN40, Ireland
[5] Scuola Int Super Studi Avanzati, SISSA, I-34136 Trieste, Italy
[6] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[7] Univ Oxford, Dept Comp Sci, Oxford OX1 3QG, England
Keywords
Training; adversarial attacks; adversarial robustness; Bayesian inference; Bayesian neural networks (BNNs); Gaussian processes
DOI
10.1109/TNNLS.2024.3386642
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models that are robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in this limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that, in this limit, BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely wide BNNs to Gaussian processes (GPs), we prove that, under certain relatively mild assumptions, the expected gradient of the loss with respect to the BNN posterior distribution vanishes, even when each NN sampled from the BNN posterior does not have vanishing gradients. Experimental results on MNIST, Fashion MNIST, and a synthetic dataset, with BNNs trained by Hamiltonian Monte Carlo and variational inference, support this line of argument, empirically showing that BNNs can display both high accuracy on clean data and robustness to gradient-based and gradient-free adversarial attacks.
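The central quantity in the abstract is the expected gradient of the loss with respect to the input, averaged over the BNN posterior. The following is a minimal sketch (not the authors' code) of how this expectation is typically approximated by Monte Carlo averaging over posterior weight samples and then used in an FGSM-style attack on the posterior predictive; names such as posterior_nets (a list of networks whose weights are drawn from the posterior, e.g. via Hamiltonian Monte Carlo or variational inference) are illustrative assumptions.

import torch
import torch.nn.functional as F

def expected_loss_gradient(posterior_nets, x, y):
    """Monte Carlo estimate of E_posterior[ d loss / d x ] over sampled networks."""
    grads = []
    for net in posterior_nets:
        x_in = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(net(x_in), y)
        loss.backward()
        grads.append(x_in.grad.detach())
    # The paper's claim: in the infinite-width limit this average vanishes,
    # even though each individual sampled network can have large gradients.
    return torch.stack(grads).mean(dim=0)

def fgsm_on_posterior(posterior_nets, x, y, eps=0.1):
    """FGSM step using the expected gradient; ineffective when that gradient is near zero."""
    g = expected_loss_gradient(posterior_nets, x, y)
    return torch.clamp(x + eps * g.sign(), 0.0, 1.0)

The design point this sketch illustrates is cancellation: gradients of individual posterior samples may be large but point in different directions, so their posterior average (the quantity a gradient-based attacker on the predictive distribution relies on) can be close to zero.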
Pages: 1-14
Page count: 14
Related Papers
50 records
  • [1] Robustness of Bayesian Neural Networks to White-Box Adversarial Attacks
    Uchendu, Adaku
    Campoy, Daniel
    Menart, Christopher
    Hildenbrandt, Alexandra
    2021 IEEE FOURTH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE 2021), 2021, : 72 - 80
  • [2] Relative Robustness of Quantized Neural Networks Against Adversarial Attacks
    Duncan, Kirsty
    Komendantskaya, Ekaterina
    Stewart, Robert
    Lones, Michael
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [3] Robustness of Bayesian Neural Networks to Gradient-Based Attacks
    Carbone, Ginevra
    Wicker, Matthew
    Laurenti, Luca
    Patane, Andrea
    Bortolussi, Luca
    Sanguinetti, Guido
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [4] Robustness of Sparsely Distributed Representations to Adversarial Attacks in Deep Neural Networks
    Sardar, Nida
    Khan, Sundas
    Hintze, Arend
    Mehra, Priyanka
    ENTROPY, 2023, 25 (06)
  • [5] Robustness Against Adversarial Attacks in Neural Networks Using Incremental Dissipativity
    Aquino, Bernardo
    Rahnama, Arash
    Seiler, Peter
    Lin, Lizhen
    Gupta, Vijay
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 2341 - 2346
  • [6] Improving Robustness Against Adversarial Attacks with Deeply Quantized Neural Networks
    Ayaz, Ferheen
    Zakariyya, Idris
    Cano, Jose
    Keoh, Sye Loong
    Singer, Jeremy
    Pau, Danilo
    Kharbouche-Harrari, Mounia
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [7] Improving Robustness Against Adversarial Attacks with Deeply Quantized Neural Networks
    Ayaz, Ferheen
    Zakariyya, Idris
    Cano, José
    Keoh, Sye Loong
    Singer, Jeremy
    Pau, Danilo
    Kharbouche-Harrari, Mounia
    arXiv, 2023,
  • [8] Robustness and Transferability of Adversarial Attacks on Different Image Classification Neural Networks
    Smagulova, Kamilya
    Bacha, Lina
    Fouda, Mohammed E.
    Kanj, Rouwaida
    Eltawil, Ahmed
    ELECTRONICS, 2024, 13 (03)
  • [9] MRobust: A Method for Robustness against Adversarial Attacks on Deep Neural Networks
    Liu, Yi-Ling
    Lomuscio, Alessio
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [10] Improving adversarial robustness of Bayesian neural networks via multi-task adversarial training
    Chen, Xu
    Liu, Chuancai
    Zhao, Yue
    Jia, Zhiyang
    Jin, Ge
    INFORMATION SCIENCES, 2022, 592 : 156 - 173