Visualizing and Analyzing the Topology of Neuron Activations in Deep Adversarial Training

Cited by: 0
Authors
Zhou, Youjia [1 ]
Zhou, Yi [1 ]
Ding, Jie [2 ]
Wang, Bei [1 ]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84112 USA
[2] Univ Minnesota Twin Cities, Minneapolis, MN USA
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Deep models are known to be vulnerable to data adversarial attacks, and many adversarial training techniques have been developed to improve their adversarial robustness. While data adversaries attack model predictions by modifying data, little is known about their impact on the neuron activations produced by the model, which play a crucial role in determining the model's predictions and interpretability. In this work, we aim to develop a topological understanding of adversarial training to enhance its interpretability. We analyze the topological structure, in particular mapper graphs, of the neuron activations of data samples produced by deep adversarial training. Each node of a mapper graph represents a cluster of activations, and two nodes are connected by an edge if their corresponding clusters have a nonempty intersection. We provide an interactive visualization tool that demonstrates the utility of our topological framework in exploring the activation space. We found that stronger attacks make data samples more indistinguishable in the neuron activation space, which leads to lower accuracy. Our tool also provides a natural way to identify vulnerable data samples, which may be useful in improving model robustness.
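The abstract's mapper-graph construction (nodes are clusters of activations; an edge joins two nodes whose clusters share points) can be sketched compactly. The Python snippet below is an illustrative toy, not the authors' implementation: it assumes a 1-D filter function over the activations and treats each overlapping cover interval as a single cluster, whereas a full mapper pipeline would additionally run a clustering algorithm within each cover element.

```python
from itertools import combinations

def mapper_graph(filter_vals, n_intervals=4, overlap=0.5):
    """Toy 1-D mapper: cover the filter range with overlapping intervals,
    treat the point indices landing in each interval as one 'cluster'
    (no within-interval clustering), and connect clusters that share points."""
    lo, hi = min(filter_vals), max(filter_vals)
    length = (hi - lo) / n_intervals          # width of each cover interval
    step = length * (1 - overlap)             # shift between interval starts
    clusters = []
    start = lo
    while start < hi:
        # Indices of points whose filter value falls in this interval
        members = {i for i, f in enumerate(filter_vals)
                   if start <= f <= start + length}
        if members:
            clusters.append(members)
        start += step
    # Edge between any two clusters with nonempty intersection
    edges = [(a, b)
             for (a, ca), (b, cb) in combinations(enumerate(clusters), 2)
             if ca & cb]
    return clusters, edges

# Ten points on a line with the identity filter yield a chain-shaped graph,
# mirroring how mapper recovers the coarse shape of the underlying data.
clusters, edges = mapper_graph(list(range(10)))
```

In the paper's setting, `filter_vals` would come from a lens over high-dimensional neuron activations rather than raw 1-D data, but the node/edge construction is the same.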
Pages: 11
Related papers
50 records in total
  • [1] Domain Adaptation with Adversarial Training on Penultimate Activations
    Sun, Tao
    Lu, Cheng
    Ling, Haibin
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9935 - 9943
  • [2] Low Curvature Activations Reduce Overfitting in Adversarial Training
    Singla, Vasu
    Singla, Sahil
    Feizi, Soheil
    Jacobs, David
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 16403 - 16413
  • [3] Explaining the Behavior of Neuron Activations in Deep Neural Networks
    Wang, Longwei
    Wang, Chengfei
    Li, Yupeng
    Wang, Rui
    AD HOC NETWORKS, 2021, 111
  • [4] Analyzing and Visualizing Deep Neural Networks for Speech Recognition with Saliency-Adjusted Neuron Activation Profiles
    Krug, Andreas
    Ebrahimzadeh, Maral
    Alemann, Jost
    Johannsmeier, Jens
    Stober, Sebastian
    ELECTRONICS, 2021, 10 (11)
  • [5] Deep Recommendation With Adversarial Training
    Zhang, Chenyan
    Li, Jing
    Wu, Jia
    Liu, Donghua
    Chang, Jun
    Gao, Rong
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2022, 10 (04) : 1966 - 1978
  • [6] Analyzing Dynamic Adversarial Training Data in the Limit
    Wallace, Eric
    Williams, Adina
    Jia, Robin
    Kiela, Douwe
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 202 - 217
  • [7] Defense Against Adversarial Attacks Using Topology Aligning Adversarial Training
    Kuang, Huafeng
    Liu, Hong
    Lin, Xianming
    Ji, Rongrong
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 3659 - 3673
  • [8] Visualizing abnormalities in chest radiographs through salient network activations in Deep Learning
    Sivaramakrishnan, R.
    Antani, S.
    Xue, Z.
    Candemir, S.
    Jaeger, S.
    Thoma, G. R.
    2017 IEEE LIFE SCIENCES CONFERENCE (LSC), 2017, : 71 - 74
  • [9] Adversarial Neuron Pruning Purifies Backdoored Deep Models
    Wu, Dongxian
    Wang, Yisen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [10] Reconstructing perceived faces from brain activations with deep adversarial neural decoding
    Gucluturk, Yagmur
    Guclu, Umut
    Seeliger, Katja
    Bosch, Sander
    van Lier, Rob
    van Gerven, Marcel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30