On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Cited by: 0
Authors
Bortolussi, Luca [1 ]
Carbone, Ginevra [2 ]
Laurenti, Luca [3 ]
Patane, Andrea [4 ]
Sanguinetti, Guido [5 ,6 ]
Wicker, Matthew [7 ]
Affiliations
[1] Univ Trieste, Dept Math Informat & Geosci, Trieste, Italy
[2] Univ Trieste, Dept Math & Geosci, I-34128 Trieste, Italy
[3] TU Delft Univ, Delft Ctr Syst & Control, NL-2628 CN Delft, Netherlands
[4] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin D02 PN40, Ireland
[5] Scuola Int Super Studi Avanzati, SISSA, I-34136 Trieste, Italy
[6] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[7] Univ Oxford, Dept Comp Sci, Oxford OX1 3QG, England
Keywords
Training; adversarial attacks; adversarial robustness; Bayesian inference; Bayesian neural networks (BNNs); Gaussian process
DOI
10.1109/TNNLS.2024.3386642
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models that are robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in this limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely wide BNNs to Gaussian processes (GPs), we prove that, under relatively mild assumptions, the expected gradient of the loss with respect to the BNN posterior distribution vanishes, even when individual networks sampled from the posterior do not have vanishing gradients. Experimental results on MNIST, Fashion-MNIST, and a synthetic dataset, with BNNs trained by Hamiltonian Monte Carlo and variational inference, support this line of argument, empirically showing that BNNs can achieve both high accuracy on clean data and robustness to gradient-based and gradient-free adversarial attacks.
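The key technical claim above is that the loss gradient averaged over the BNN posterior vanishes even though individual posterior samples have non-vanishing gradients, which is what blunts gradient-based attacks such as FGSM. Below is a minimal sketch of that cancellation effect, assuming Python with PyTorch; the randomly initialised networks are only a toy stand-in for true posterior samples (which the paper obtains with Hamiltonian Monte Carlo or variational inference), and the architecture, input, and sample count are illustrative, not taken from the paper.

```python
# Toy illustration (PyTorch assumed): input gradient of one network vs. the
# gradient averaged over an "ensemble" standing in for BNN posterior samples.
import torch
import torch.nn as nn

torch.manual_seed(0)
loss_fn = nn.CrossEntropyLoss()

def make_net():
    # Illustrative fully connected architecture for 28x28 inputs, 10 classes.
    return nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))

def input_grad(net, x, y):
    # Gradient of the loss w.r.t. the input: the quantity FGSM-style attacks follow.
    x = x.clone().detach().requires_grad_(True)
    loss_fn(net(x), y).backward()
    return x.grad.detach()

x = torch.rand(1, 784)        # stand-in for a flattened MNIST image
y = torch.tensor([3])         # arbitrary label

nets = [make_net() for _ in range(50)]   # toy "posterior samples" (random draws)

single = input_grad(nets[0], x, y)
averaged = torch.stack([input_grad(n, x, y) for n in nets]).mean(dim=0)

# Individual gradients are far from zero, but they point in poorly correlated
# directions, so their average (the "expected" attack direction) is much smaller.
print("single-network gradient norm:", single.norm().item())
print("averaged gradient norm:      ", averaged.norm().item())
```

In this toy setting the averaging merely shrinks the attack direction roughly as one over the square root of the number of samples; the paper proves the stronger statement that the expected gradient vanishes in the infinite-width limit when the data lie on a lower-dimensional submanifold.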
Pages: 1-14
Number of pages: 14
Related Papers
50 records in total
  • [21] A Hybrid Bayesian-Convolutional Neural Network for Adversarial Robustness
    Khong, Thi Thu Thao
    Nakada, Takashi
    Nakashima, Yasuhiko
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (07) : 1308 - 1319
  • [22] Adversarial attacks and adversarial robustness in computational pathology
    Ghaffari Laleh, Narmin
    Truhn, Daniel
    Veldhuizen, Gregory Patrick
    Han, Tianyu
    van Treeck, Marko
    Buelow, Roman D.
    Langer, Rupert
    Dislich, Bastian
    Boor, Peter
    Schulz, Volkmar
    Kather, Jakob Nikolas
    NATURE COMMUNICATIONS, 2022, 13 (01)
  • [23] Robustness of Spiking Neural Networks Based on Time-to-First-Spike Encoding Against Adversarial Attacks
    Nomura, Osamu
    Sakemi, Yusuke
    Hosomi, Takeo
    Morie, Takashi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (09) : 3640 - 3644
  • [25] Improving Bayesian Neural Networks by Adversarial Sampling
    Zhang, Jiaru
    Hua, Yang
    Song, Tao
    Wang, Hao
    Xue, Zhengui
    Ma, Ruhui
    Guan, Haibing
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 10110 - 10117
  • [26] On the Robustness of Neural-Enhanced Video Streaming against Adversarial Attacks
    Zhou, Qihua
    Guo, Jingcai
    Guo, Song
    Li, Ruibin
    Zhang, Jie
    Wang, Bingjie
    Xu, Zhenda
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 15, 2024, : 17123 - 17131
  • [27] Chaotic neural network quantization and its robustness against adversarial attacks
    Osama, Alaa
    Gadallah, Samar I.
    Said, Lobna A.
    Radwan, Ahmed G.
    Fouda, Mohammed E.
    KNOWLEDGE-BASED SYSTEMS, 2024, 286
  • [28] A Comprehensive Analysis on Adversarial Robustness of Spiking Neural Networks
    Sharmin, Saima
    Panda, Priyadarshini
    Sarwar, Syed Shakib
    Lee, Chankyu
    Ponghiran, Wachirawit
    Roy, Kaushik
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [29] An orthogonal classifier for improving the adversarial robustness of neural networks
    Xu, Cong
    Li, Xiang
    Yang, Min
    INFORMATION SCIENCES, 2022, 591 : 251 - 262
  • [30] Statistical Guarantees for the Robustness of Bayesian Neural Networks
    Cardelli, Luca
    Kwiatkowska, Marta
    Laurenti, Luca
    Paoletti, Nicola
    Patane, Andrea
    Wicker, Matthew
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 5693 - 5700