Robust Detection of Adversarial Attacks by Modeling the Intrinsic Properties of Deep Neural Networks

Cited by: 0

Authors:

Zheng, Zhihao [1]

Hong, Pengyu [1]

Affiliations:

[1] Brandeis Univ, Dept Comp Sci, Waltham, MA 02453 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Volume 31

Keywords:

GAME; GO;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

It has been shown that deep neural network (DNN) based classifiers are vulnerable to human-imperceptible adversarial perturbations, which can cause DNN classifiers to output wrong predictions with high confidence. We propose an unsupervised learning approach to detect adversarial inputs without any knowledge of the attackers. Our approach tries to capture the intrinsic properties of a DNN classifier and uses them to detect adversarial inputs. The intrinsic properties used in this study are the output distributions of the hidden neurons in a DNN classifier presented with natural images. Our approach can be easily applied to any DNN classifier or combined with other defense strategies to improve robustness. Experimental results show that our approach demonstrates state-of-the-art robustness in defending against black-box and gray-box attacks.
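The idea in the abstract (model how hidden neurons respond to natural inputs, then flag inputs whose hidden responses look unlikely) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the abstract does not say which density model is used, so a single diagonal Gaussian stands in for it; the "network" is a hypothetical random linear layer with ReLU; and the shifted inputs are a crude off-distribution stand-in, not a real adversarial attack.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one hidden layer of a trained classifier:
# a fixed random linear map followed by ReLU (assumption, not from the paper).
W = rng.normal(size=(8, 16))

def hidden_activations(x):
    """Return hidden-layer outputs for a batch of inputs (toy network)."""
    return np.maximum(x @ W, 0.0)

def fit_gaussian(h):
    """Fit a diagonal Gaussian to hidden activations of natural inputs."""
    mean = h.mean(axis=0)
    var = h.var(axis=0) + 1e-6  # small floor avoids division by zero
    return mean, var

def log_likelihood(h, mean, var):
    """Per-sample log density under the fitted diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (h - mean) ** 2 / var, axis=1)

# "Natural" inputs: samples from the distribution the model is assumed to know.
natural = rng.normal(size=(2000, 8))
mean, var = fit_gaussian(hidden_activations(natural))

# Unsupervised threshold: the 5th percentile of scores on natural inputs,
# so no knowledge of any attack is needed to set it.
natural_scores = log_likelihood(hidden_activations(natural), mean, var)
threshold = np.percentile(natural_scores, 5)

def is_suspicious(x):
    """Flag inputs whose hidden activations are unlikely under the model."""
    return log_likelihood(hidden_activations(x), mean, var) < threshold

held_out = rng.normal(size=(500, 8))   # fresh natural inputs
shifted = held_out + 3.0               # hypothetical off-distribution inputs

natural_flag_rate = is_suspicious(held_out).mean()
shifted_flag_rate = is_suspicious(shifted).mean()
print("flag rate on natural inputs:", natural_flag_rate)
print("flag rate on shifted inputs:", shifted_flag_rate)
```

The threshold is set from natural data alone, which mirrors the unsupervised framing in the abstract; a richer density model (for example a per-class mixture) could replace the single Gaussian without changing the structure of the sketch.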


Pages: 10
