Self-Supervised Adversarial Example Detection by Disentangled Representation

Cited by: 0
Authors
Zhang, Zhaoxi [1]
Zhang, Leo Yu [2]
Zheng, Xufei [1]
Tian, Jinyu [3]
Zhou, Jiantao [3]
Affiliations
[1] Southwest Univ, Sch Comp & Informat Sci, Chongqing, Peoples R China
[2] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
[3] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China
DOI
10.1109/TrustCom56396.2022.00137
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Deep learning models are known to be vulnerable to adversarial examples, which are carefully crafted for malicious purposes and are imperceptible to humans. Autoencoders trained solely on benign examples have been widely used for (self-supervised) adversarial detection, based on the assumption that adversarial examples yield larger reconstruction errors. However, because the training data contain no adversarial examples and the autoencoder generalizes too well, this assumption does not always hold in practice. To alleviate this problem, we explore how to detect adversarial examples with disentangled label/semantic features under the autoencoder structure. Specifically, we propose Disentangled Representation-based Reconstruction (DRR). In DRR, we train an autoencoder over both correctly paired and incorrectly paired label/semantic features to reconstruct benign examples and counterexamples. This mimics the behavior of adversarial examples and reduces the unnecessary generalization ability of the autoencoder. We compare our method with state-of-the-art self-supervised detection methods under different adversarial attacks and different victim models, and it exhibits better performance on various metrics (area under the ROC curve, true positive rate, and true negative rate) for most attack settings. Although DRR was initially designed for visual tasks only, we demonstrate that it can be easily extended to natural language tasks as well. Notably, unlike other autoencoder-based detectors, our method provides resistance to an adaptive adversary.
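The detection assumption described in the abstract (adversarial inputs reconstruct worse than benign ones) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the mean-squared-error score, the quantile-based threshold, and the way a mismatched label is drawn are all assumptions made for illustration, and the autoencoder itself is left out.

```python
import random


def reconstruction_error(x, x_hat):
    """Mean squared error between one flattened input and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(benign_errors, target_fpr=0.05):
    """Pick a threshold from benign-only errors (hypothetical calibration rule).

    Roughly a fraction `target_fpr` of the benign errors lie above the result.
    """
    ordered = sorted(benign_errors)
    # Position of the (1 - target_fpr) empirical quantile, clamped to the list.
    idx = min(len(ordered) - 1, int((1.0 - target_fpr) * len(ordered)))
    return ordered[idx]


def is_adversarial(error, threshold):
    """Flag an input whose reconstruction error exceeds the threshold."""
    return error > threshold


def mismatched_label(label, num_classes, rng):
    """Draw a class index that is guaranteed to differ from `label`.

    Illustrates one way to form an incorrectly paired label for a
    counterexample; the pairing scheme itself is an assumption.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes to mismatch a label")
    return (label + rng.randrange(1, num_classes)) % num_classes


if __name__ == "__main__":
    rng = random.Random(0)
    benign_errors = [0.01, 0.02, 0.03, 0.04]
    threshold = fit_threshold(benign_errors, target_fpr=0.0)
    suspicious = reconstruction_error([0.0, 0.0], [1.0, 1.0])
    print(is_adversarial(suspicious, threshold))
    print(mismatched_label(3, 10, rng) != 3)
```

In a real detector the reconstructions would come from the trained autoencoder, and the threshold would be calibrated on held-out benign data; the helper for mismatched labels only shows that an incorrect pairing can be drawn without ever matching the true class.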
Pages: 1000-1007
Page count: 8