Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks

Cited by: 0
Authors
Zheng, Zhihao [1]
Hong, Pengyu [1]
Affiliations
[1] Brandeis Univ, Dept Comp Sci, Waltham, MA 02453 USA
Keywords
GAME; GO
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
It has been shown that deep neural network (DNN) based classifiers are vulnerable to human-imperceptible adversarial perturbations, which can cause DNN classifiers to output wrong predictions with high confidence. We propose an unsupervised learning approach to detect adversarial inputs without any knowledge of attackers. Our approach tries to capture the intrinsic properties of a DNN classifier and uses them to detect adversarial inputs. The intrinsic properties used in this study are the output distributions of the hidden neurons in a DNN classifier presented with natural images. Our approach can be easily applied to any DNN classifier or combined with other defense strategies to improve robustness. Experimental results show that our approach demonstrates state-of-the-art robustness in defending against black-box and gray-box attacks.
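The idea of modeling hidden-neuron output distributions on natural inputs and flagging inputs that deviate from them can be illustrated with a small sketch. This is not the authors' method: the abstract does not state the density model, which layers are used, or how the decision threshold is set. The sketch assumes an independent Gaussian per neuron, a 95th-percentile threshold fitted on natural data only, and synthetic activations in place of a real DNN. The class name `ActivationDensityDetector` and all variable names are hypothetical.

```python
import numpy as np


class ActivationDensityDetector:
    """Unsupervised detector over hidden activations.

    Assumption (not from the paper): each hidden neuron's output on
    natural inputs is modeled as an independent Gaussian.
    """

    def fit(self, natural_acts, quantile=0.95):
        # natural_acts: (n_samples, n_neurons) hidden outputs on natural inputs.
        self.mean = natural_acts.mean(axis=0)
        self.std = natural_acts.std(axis=0) + 1e-6  # avoid division by zero
        # Threshold is set from natural data only; no attack samples are used.
        self.threshold = np.quantile(self.score(natural_acts), quantile)
        return self

    def score(self, acts):
        # Mean per-neuron Gaussian negative log-likelihood; higher = less typical.
        z = (acts - self.mean) / self.std
        nll = 0.5 * z**2 + np.log(self.std) + 0.5 * np.log(2.0 * np.pi)
        return nll.mean(axis=1)

    def is_suspicious(self, acts):
        # True where the activation pattern is unlikely under the natural model.
        return self.score(acts) > self.threshold


# Synthetic stand-ins for hidden activations (assumption: no real network here).
rng = np.random.default_rng(0)
natural = rng.normal(loc=0.0, scale=1.0, size=(2000, 64))
shifted = rng.normal(loc=1.5, scale=1.0, size=(200, 64))  # off-distribution inputs

detector = ActivationDensityDetector().fit(natural)
flag_rate_natural = detector.is_suspicious(natural).mean()
flag_rate_shifted = detector.is_suspicious(shifted).mean()
print(f"flagged natural: {flag_rate_natural:.3f}, flagged shifted: {flag_rate_shifted:.3f}")
```

In practice the activations would come from a chosen hidden layer of the classifier under protection, and the quantile trades false alarms on natural inputs against missed detections.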
Pages: 10