Generating Semantic Adversarial Examples via Feature Manipulation in Latent Space

Cited by: 0

Authors:

Wang, Shuo [1]

Chen, Shangyu [2]

Chen, Tianle [3]

Nepal, Surya [1]

Rudolph, Carsten [2]

Grobler, Marthie [1]

Affiliations:

[1] CSIRO, Data61 & Cybersecur CRC, Marsfield, NSW 2122, Australia

[2] Monash Univ, Fac Informat Technol, Melbourne, Vic 3800, Australia

[3] Univ Queensland, St Lucia, Qld 4072, Australia

Source:

IEEE Transactions on Neural Networks and Learning Systems | 2024 / Vol. 35 / No. 12

Keywords:

Adversarial examples; feature manipulation; latent representation; neural networks; variational autoencoder (VAE)

DOI:

10.1109/TNNLS.2023.3299408

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The susceptibility of deep neural networks (DNNs) to adversarial intrusions, exemplified by adversarial examples, is well-documented. Conventional attacks implement unstructured, pixel-wise perturbations to mislead classifiers, which often results in a noticeable departure from natural samples and lacks human-perceptible interpretability. In this work, we present an adversarial attack strategy that implements fine-granularity, semantic-meaning-oriented structural perturbations. Our proposed methodology manipulates the semantic attributes of images through the use of disentangled latent codes. We engineer adversarial perturbations by manipulating either a single latent code or a combination thereof. To this end, we propose two unsupervised semantic manipulation strategies: one based on vector-disentangled representation and the other on feature map-disentangled representation, taking into consideration the complexity of the latent codes and the smoothness of the reconstructed images. Our empirical evaluations, conducted extensively on real-world image data, showcase the potency of our attacks, particularly against black-box classifiers. Furthermore, we establish the existence of a universal semantic adversarial example that is agnostic to specific images.

Pages: 17070-17084

Page count: 15

