Self-Supervised Adversarial Example Detection by Disentangled Representation

Cited by: 0
Authors
Zhang, Zhaoxi [1]
Zhang, Leo Yu [2]
Zheng, Xufei [1]
Tian, Jinyu [3]
Zhou, Jiantao [3]
Affiliations
[1] Southwest Univ, Sch Comp & Informat Sci, Chongqing, Peoples R China
[2] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
[3] Univ Macau, Dept Comp & Informat Sci, Macau, Peoples R China
DOI
10.1109/TrustCom56396.2022.00137
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Deep learning models are known to be vulnerable to adversarial examples, which are elaborately crafted for malicious purposes and are imperceptible to the human perceptual system. Autoencoders trained solely on benign examples have been widely used for (self-supervised) adversarial detection, based on the assumption that adversarial examples yield larger reconstruction errors. However, because adversarial examples are absent from training and the autoencoder generalizes too strongly, this assumption does not always hold in practice. To alleviate this problem, we explore how to detect adversarial examples with disentangled label/semantic features under the autoencoder structure. Specifically, we propose Disentangled Representation-based Reconstruction (DRR). In DRR, we train an autoencoder over both correctly paired and incorrectly paired label/semantic features to reconstruct benign examples and counterexamples. This mimics the behavior of adversarial examples and reduces the unnecessary generalization ability of the autoencoder. We compare our method with state-of-the-art self-supervised detection methods under different adversarial attacks and different victim models, and it exhibits better performance on various metrics (area under the ROC curve, true positive rate, and true negative rate) for most attack settings. Although DRR was initially designed for visual tasks only, we demonstrate that it can be easily extended to natural language tasks as well. Notably, unlike other autoencoder-based detectors, our method provides resistance to an adaptive adversary.
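The baseline assumption that the abstract questions (inputs with a large reconstruction error are flagged as adversarial) can be illustrated with a minimal sketch. This is not the DRR method and not the authors' code: a rank-k PCA model stands in for an autoencoder trained on benign data, the "perturbed" inputs are plain random noise rather than a real attack, and every name below (`fit_linear_autoencoder`, `reconstruction_error`, `detect`, the 95th-percentile threshold) is a hypothetical choice made for illustration.

```python
import numpy as np

def fit_linear_autoencoder(benign, k):
    """Fit a rank-k PCA model (assumed stand-in for an autoencoder) on benign data."""
    mean = benign.mean(axis=0)
    # Rows of vt are principal directions; keep the top k as the "code" space.
    _, _, vt = np.linalg.svd(benign - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(x, mean, components):
    """Per-sample squared L2 distance between x and its reconstruction."""
    centered = x - mean
    recon = centered @ components.T @ components
    return np.sum((centered - recon) ** 2, axis=1)

def detect(x, mean, components, threshold):
    """Flag samples whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, mean, components) > threshold

rng = np.random.default_rng(0)

# Synthetic benign data: points near a 2-D subspace of a 10-D space.
basis = rng.normal(size=(2, 10))
benign = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 10))

mean, comps = fit_linear_autoencoder(benign, k=2)

# Assumed threshold: 95th percentile of the benign reconstruction errors.
threshold = np.percentile(reconstruction_error(benign, mean, comps), 95)

# Hypothetical perturbed inputs: benign samples pushed off the subspace by noise.
perturbed = benign[:100] + 0.5 * rng.normal(size=(100, 10))

flags_benign = detect(benign, mean, comps, threshold)
flags_perturbed = detect(perturbed, mean, comps, threshold)

tnr = 1.0 - flags_benign.mean()   # true negative rate on benign inputs
tpr = flags_perturbed.mean()      # true positive rate on perturbed inputs
print(f"TNR={tnr:.2f}, TPR={tpr:.2f}")
```

In this toy setting the perturbation is large and lies off the benign subspace, so the threshold separates the two groups easily. The abstract's point is that real adversarial perturbations are small and an autoencoder may reconstruct them well, which is the failure case that training on incorrectly paired label/semantic features is meant to address.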
Pages: 1000-1007
Page count: 8
Related papers
50 entries in total
  • [21] SELF-SUPERVISED ADVERSARIAL SHAPE COMPLETION
    Peters, Torben
    Schindler, Konrad
    Brenner, Claus
    [J]. XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II, 2022, 5-2 : 143 - 150
  • [22] A Self-supervised Approach for Adversarial Robustness
    Naseer, Muzammal
    Khan, Salman
    Hayat, Munawar
    Khan, Fahad Shahbaz
    Porikli, Fatih
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020: 259 - 268
  • [23] Adversarial Self-Supervised Contrastive Learning
    Kim, Minseon
    Tack, Jihoon
    Hwang, Sung Ju
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (NEURIPS 2020), 2020, 33
  • [24] Self-Supervised Generative Adversarial Compression
    Yu, Chong
    Pool, Jeff
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [25] Adversarial Masking for Self-Supervised Learning
    Shi, Yuge
    Siddharth, N.
    Torr, Philip H. S.
    Kosiorek, Adam R.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [26] Self-Supervised Adversarial Imitation Learning
    Monteiro, Juarez
    Gavenski, Nathan
    Meneguzzi, Felipe
    Barros, Rodrigo C.
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [27] DISENTANGLED SPEECH REPRESENTATION LEARNING BASED ON FACTORIZED HIERARCHICAL VARIATIONAL AUTOENCODER WITH SELF-SUPERVISED OBJECTIVE
    Xie, Yuying
    Arildsen, Thomas
    Tan, Zheng-Hua
    [J]. 2021 IEEE 31ST INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2021
  • [28] Self-supervised time-frequency representation based on generative adversarial networks
    Liu, Naihao
    Lei, Youbo
    Yang, Yang
    Wei, Shengtao
    Gao, Jinghuai
    Jiang, Xiudi
    [J]. GEOPHYSICS, 2023, 88 (04) : IM87 - IM99
  • [29] Self-supervised Graph-level Representation Learning with Adversarial Contrastive Learning
    Luo, Xiao
    Ju, Wei
    Gu, Yiyang
    Mao, Zhengyang
    Liu, Luchen
    Yuan, Yuhui
    Zhang, Ming
    [J]. ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2024, 18 (02)
  • [30] Self-Supervised Attentive Generative Adversarial Networks for Video Anomaly Detection
    Huang, Chao
    Wen, Jie
    Xu, Yong
    Jiang, Qiuping
    Yang, Jian
    Wang, Yaowei
    Zhang, David
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (11) : 9389 - 9403