On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Times cited: 0

Authors
Bortolussi, Luca [1 ]
Carbone, Ginevra [2 ]
Laurenti, Luca [3 ]
Patane, Andrea [4 ]
Sanguinetti, Guido [5 ,6 ]
Wicker, Matthew [7 ]
Affiliations
[1] Univ Trieste, Dept Math Informat & Geosci, Trieste, Italy
[2] Univ Trieste, Dept Math & Geosci, I-34128 Trieste, Italy
[3] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CN Delft, Netherlands
[4] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin D02 PN40, Ireland
[5] Scuola Int Super Studi Avanzati, SISSA, I-34136 Trieste, Italy
[6] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[7] Univ Oxford, Dept Comp Sci, Oxford OX1 3QG, England
Keywords
Training; adversarial attacks; adversarial robustness; Bayesian inference; Bayesian neural networks (BNNs); Gaussian processes
DOI
10.1109/TNNLS.2024.3386642
CLC number
TP18 [Theory of artificial intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in this limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit, BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely wide BNNs to Gaussian processes (GPs), we prove that, under certain relatively mild assumptions, the expected gradient of the loss with respect to the BNN posterior distribution vanishes, even when each network sampled from the posterior has non-vanishing gradients. Experimental results on MNIST, Fashion-MNIST, and a synthetic dataset, with BNNs trained by Hamiltonian Monte Carlo and by variational inference, support this line of argument, empirically showing that BNNs can combine high accuracy on clean data with robustness to both gradient-based and gradient-free adversarial attacks.
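To make the abstract's key mechanism concrete, the following is a minimal NumPy sketch, not taken from the paper: it samples many wide one-hidden-layer tanh networks with Gaussian weights, used here as a crude stand-in for draws from a BNN posterior, and compares the input-gradient norm of individual networks against that of their ensemble average. Every choice in it (the network form, the Gaussian weight distribution, and the sizes d, width, and n_samples) is an illustrative assumption; the paper's theorem concerns the true posterior in the infinite-width GP limit.

```python
import numpy as np

# Toy illustration of the vanishing expected gradient: each sampled wide
# network has an O(1) input gradient, but the gradient of the ensemble
# average (a crude stand-in for the BNN posterior predictive) shrinks
# as more samples are averaged.

rng = np.random.default_rng(0)
d, width, n_samples = 20, 2000, 500   # assumed sizes, chosen for illustration
x = rng.standard_normal(d)            # a fixed test input

def input_grad(U, v, x):
    """Gradient w.r.t. x of f(x) = v @ tanh(U @ x) / sqrt(width)."""
    pre = U @ x                                        # pre-activations, shape (width,)
    return (U.T @ (v * (1.0 - np.tanh(pre) ** 2))) / np.sqrt(len(v))

grads = []
for _ in range(n_samples):
    U = rng.standard_normal((width, d)) / np.sqrt(d)   # input-to-hidden weights
    v = rng.standard_normal(width)                     # hidden-to-output weights
    grads.append(input_grad(U, v, x))
grads = np.stack(grads)                                # shape (n_samples, d)

print("mean per-sample gradient norm:", np.linalg.norm(grads, axis=1).mean())
print("norm of the averaged gradient:", np.linalg.norm(grads.mean(axis=0)))
```

Under these assumptions the per-sample gradient norms stay O(1) while the averaged gradient's norm decays roughly as 1/sqrt(n_samples), the qualitative behaviour the abstract attributes to BNN posteriors: a gradient-based attacker who can only query the posterior predictive sees an attenuated attack direction.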
Pages: 1-14
Number of pages: 14