Adversarial Training Time Attack Against Discriminative and Generative Convolutional Models

Cited by: 4
Authors
Chaudhury, Subhajit [1 ]
Roy, Hiya [1 ]
Mishra, Sourav [1 ]
Yamasaki, Toshihiko [1 ]
Affiliations
[1] Univ Tokyo, Dept Informat & Commun Engn, Bunkyo Ku, Tokyo 1138656, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Training; Neural networks; Perturbation methods; Optimization; Testing; Noise measurement; Generative adversarial networks; Generalization in deep learning; data poisoning; adaptive optimization; training time attack; variational information bottleneck;
DOI
10.1109/ACCESS.2021.3101282
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
In this paper, we show that adversarial training time attacks based on a few pixel modifications can cause undesirable overfitting in neural networks, for both discriminative and generative models. We propose an evolutionary algorithm that searches for an optimal pixel attack using a novel cost function, inspired by the domain adaptation literature, to design our training time attack. The proposed cost function explicitly maximizes the generalization gap and the domain divergence between clean and corrupted images. Empirical evaluations demonstrate that our adversarial training attack achieves significantly low testing accuracy (with high training accuracy) on multiple datasets by perturbing just a single pixel in the training images. Even when popular regularization techniques are used, we observe a significant performance drop compared to training on clean data. Our attack is more successful than previous pixel-based training time attacks on state-of-the-art Convolutional Neural Network (CNN) architectures, as evidenced by significantly lower testing accuracy. Interestingly, we find that the choice of optimizer plays an essential role in robustness against our attack. We empirically observe that Stochastic Gradient Descent (SGD) is resilient to the proposed adversarial training attack, unlike adaptive optimization techniques such as the popular Adam optimizer. We identify that such vulnerabilities are caused by over-reliance of the cross-entropy (CE) loss on highly predictive features. Therefore, we propose a robust loss function that maximizes the mutual information between latent features and input images, in addition to optimizing the CE loss. Finally, we show that the discriminator in Generative Adversarial Networks (GANs) can also be attacked by our proposed training time attack, resulting in poor generative performance. Our paper is one of the first works to design attacks for generative models.
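The abstract mentions two concrete mechanisms: a training-time attack that perturbs a single pixel in every training image, and a defence that augments the CE loss with a term encouraging high mutual information between latent features and the input. The following is a minimal PyTorch sketch of those two ideas, not the authors' implementation: the names poison_single_pixel, RobustNet, robust_loss, and mi_weight are illustrative, and the reconstruction decoder is only one common variational proxy for the feature/input mutual-information term; the paper itself uses a variational information bottleneck formulation and an evolutionary search over the pixel location and value, which this sketch does not reproduce.

# Minimal sketch (assumptions noted above), PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

def poison_single_pixel(images, row, col, value=1.0):
    """Overwrite one pixel in every training image (shape: N x C x H x W)."""
    poisoned = images.clone()
    poisoned[:, :, row, col] = value
    return poisoned

class RobustNet(nn.Module):
    """Classifier with a small decoder used as a proxy for I(X; Z)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(7), nn.Flatten(),
            nn.Linear(16 * 7 * 7, 64), nn.ReLU(),
        )
        self.classifier = nn.Linear(64, num_classes)
        self.decoder = nn.Linear(64, 28 * 28)  # reconstructs the input from Z

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z).view(-1, 1, 28, 28)

def robust_loss(logits, recon, x, y, mi_weight=0.1):
    # CE on the predictive features plus a reconstruction term that serves as
    # a simple lower-bound proxy for the mutual information between the latent
    # features and the input image (an assumption, not the paper's exact loss).
    return F.cross_entropy(logits, y) + mi_weight * F.mse_loss(recon, x)

if __name__ == "__main__":
    x = torch.rand(8, 1, 28, 28)                       # toy image batch
    y = torch.randint(0, 10, (8,))                      # toy labels
    x_poisoned = poison_single_pixel(x, row=0, col=0)   # one-pixel training-time attack
    model = RobustNet()
    logits, recon = model(x_poisoned)
    loss = robust_loss(logits, recon, x_poisoned, y)
    loss.backward()
    print(float(loss))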
Pages: 109241 - 109259
Page count: 19
Related Papers
50 items in total
  • [1] On the Evaluation of Generative Adversarial Networks By Discriminative Models
    Torfi, Amirsina
    Beyki, Mohammadreza
    Fox, Edward A.
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 991 - 998
  • [2] Training Discriminative Models to Evaluate Generative Ones
    Lesort, Timothee
    Stoian, Andrei
    Goudou, Jean-Francois
    Filliat, David
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: IMAGE PROCESSING, PT III, 2019, 11729 : 604 - 619
  • [3] Font Creation Using Class Discriminative Deep Convolutional Generative Adversarial Networks
    Abe, Kotaro
    Iwana, Brian Kenji
    Holmer, Viktor Gosta
    Uchida, Seiichi
    [J]. PROCEEDINGS 2017 4TH IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR), 2017, : 232 - 237
  • [4] Generative Transferable Adversarial Attack
    Li, Yifeng
    Zhang, Ya
    Zhang, Rui
    Wang, Yanfeng
    [J]. ICVIP 2019: PROCEEDINGS OF 2019 3RD INTERNATIONAL CONFERENCE ON VIDEO AND IMAGE PROCESSING, 2019, : 84 - 89
  • [5] ARTIFICIAL BANDWIDTH EXTENSION USING A CONDITIONAL GENERATIVE ADVERSARIAL NETWORK WITH DISCRIMINATIVE TRAINING
    Sautter, Jonas
    Faubel, Friedrich
    Buck, Markus
    Schmidt, Gerhard
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 7005 - 7009
  • [6] Diversity Adversarial Training against Adversarial Attack on Deep Neural Networks
    Kwon, Hyun
    Lee, Jun
    [J]. SYMMETRY-BASEL, 2021, 13 (03):
  • [7] IDSGAN: Generative Adversarial Networks for Attack Generation Against Intrusion Detection
    Lin, Zilong
    Shi, Yong
    Xue, Zhi
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2022, PT III, 2022, 13282 : 79 - 91
  • [8] Hybrid generative-discriminative training of Gaussian mixture models
    Roth, Wolfgang
    Peharz, Robert
    Tschiatschek, Sebastian
    Pernkopf, Franz
    [J]. PATTERN RECOGNITION LETTERS, 2018, 112 : 131 - 137
  • [9] Discriminative Forests Improve Generative Diversity for Generative Adversarial Networks
    Chen, Junjie
    Li, Jiahao
    Song, Chen
    Li, Bin
    Chen, Qingcai
    Gao, Hongchang
    Wang, Wendy Hui
    Xu, Zenglin
    Shi, Xinghua
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10, 2024, : 11338 - 11345
  • [10] Iterative Training Attack: A Black-Box Adversarial Attack via Perturbation Generative Network
    Lei, Hong
    Jiang, Wei
    Zhan, Jinyu
    You, Shen
    Jin, Lingxin
    Xie, Xiaona
    Chang, Zhengwei
    [J]. JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2023, 32 (18)