Visualizing and Analyzing the Topology of Neuron Activations in Deep Adversarial Training

Cited by: 0
Authors
Zhou, Youjia [1 ]
Zhou, Yi [1 ]
Ding, Jie [2 ]
Wang, Bei [1 ]
Affiliations
[1] Univ Utah, Salt Lake City, UT 84112 USA
[2] Univ Minnesota Twin Cities, Minneapolis, MN USA
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Deep models are known to be vulnerable to data adversarial attacks, and many adversarial training techniques have been developed to improve their adversarial robustness. While data adversaries attack model predictions by modifying data, little is known about their impact on the neuron activations produced by the model, which play a crucial role in determining the model's predictions and interpretability. In this work, we aim to develop a topological understanding of adversarial training to enhance its interpretability. We analyze the topological structure, in particular mapper graphs, of the neuron activations of data samples produced by deep adversarial training. Each node of a mapper graph represents a cluster of activations, and two nodes are connected by an edge if their corresponding clusters have a nonempty intersection. We provide an interactive visualization tool that demonstrates the utility of our topological framework in exploring the activation space. We found that stronger attacks make data samples more indistinguishable in the neuron activation space, which leads to lower accuracy. Our tool also provides a natural way to identify vulnerable data samples, which may be useful in improving model robustness.
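The abstract's mapper-graph construction (nodes are clusters of activations; an edge joins two nodes whose clusters share points) can be sketched compactly. The Python snippet below is an illustrative toy, not the authors' implementation: it assumes a 1-D filter function over the activations and treats each overlapping cover interval as a single cluster, whereas a full mapper pipeline would additionally run a clustering algorithm within each cover element.

```python
from itertools import combinations

def mapper_graph(filter_vals, n_intervals=4, overlap=0.5):
    """Toy 1-D mapper: cover the filter range with overlapping intervals,
    treat the point indices landing in each interval as one 'cluster'
    (no within-interval clustering), and connect clusters that share points."""
    lo, hi = min(filter_vals), max(filter_vals)
    length = (hi - lo) / n_intervals          # width of each cover interval
    step = length * (1 - overlap)             # shift between interval starts
    clusters = []
    start = lo
    while start < hi:
        # Indices of points whose filter value falls in this interval
        members = {i for i, f in enumerate(filter_vals)
                   if start <= f <= start + length}
        if members:
            clusters.append(members)
        start += step
    # Edge between any two clusters with nonempty intersection
    edges = [(a, b)
             for (a, ca), (b, cb) in combinations(enumerate(clusters), 2)
             if ca & cb]
    return clusters, edges

# Ten points on a line with the identity filter yield a chain-shaped graph,
# mirroring how mapper recovers the coarse shape of the underlying data.
clusters, edges = mapper_graph(list(range(10)))
```

In the paper's setting, `filter_vals` would come from a lens over high-dimensional neuron activations rather than raw 1-D data, but the node/edge construction is the same.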
Pages: 11
Related papers
50 records in total
  • [1] Domain Adaptation with Adversarial Training on Penultimate Activations
    Sun, Tao
    Lu, Cheng
    Ling, Haibin
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9935 - 9943
  • [2] Low Curvature Activations Reduce Overfitting in Adversarial Training
    Singla, Vasu
    Singla, Sahil
    Feizi, Soheil
    Jacobs, David
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 16403 - 16413
  • [3] Explaining the Behavior of Neuron Activations in Deep Neural Networks
    Wang, Longwei
    Wang, Chengfei
    Li, Yupeng
    Wang, Rui
    AD HOC NETWORKS, 2021, 111
  • [4] Analyzing and Visualizing Deep Neural Networks for Speech Recognition with Saliency-Adjusted Neuron Activation Profiles
    Krug, Andreas
    Ebrahimzadeh, Maral
    Alemann, Jost
    Johannsmeier, Jens
    Stober, Sebastian
    ELECTRONICS, 2021, 10 (11)
  • [5] Deep Recommendation With Adversarial Training
    Zhang, Chenyan
    Li, Jing
    Wu, Jia
    Liu, Donghua
    Chang, Jun
    Gao, Rong
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2022, 10 (04) : 1966 - 1978
  • [6] Analyzing Dynamic Adversarial Training Data in the Limit
    Wallace, Eric
    Williams, Adina
    Jia, Robin
    Kiela, Douwe
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 202 - 217
  • [7] Defense Against Adversarial Attacks Using Topology Aligning Adversarial Training
    Kuang, Huafeng
    Liu, Hong
    Lin, Xianming
    Ji, Rongrong
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 3659 - 3673
  • [8] Visualizing abnormalities in chest radiographs through salient network activations in Deep Learning
    Sivaramakrishnan, R.
    Antani, S.
    Xue, Z.
    Candemir, S.
    Jaeger, S.
    Thoma, G. R.
    2017 IEEE LIFE SCIENCES CONFERENCE (LSC), 2017, : 71 - 74
  • [9] Adversarial Neuron Pruning Purifies Backdoored Deep Models
    Wu, Dongxian
    Wang, Yisen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [10] Reconstructing perceived faces from brain activations with deep adversarial neural decoding
    Gucluturk, Yagmur
    Guclu, Umut
    Seeliger, Katja
    Bosch, Sander
    van Lier, Rob
    van Gerven, Marcel
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30