Efficient Generation of Targeted and Transferable Adversarial Examples for Vision-Language Models via Diffusion Models

Cited: 0
Authors
Guo, Qi [1 ,2 ]
Pang, Shanmin [1 ]
Jia, Xiaojun [3 ]
Liu, Yang [3 ]
Guo, Qing [2 ,4 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[2] Agcy Sci Technol & Res, Ctr Frontier AI Res, Singapore 138632, Singapore
[3] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore
[4] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China; National Research Foundation of Singapore;
Keywords
Adversarial attack; visual language models; diffusion models; score matching;
DOI
10.1109/TIFS.2024.3518072
CLC Number
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
Adversarial attacks, particularly targeted transfer-based attacks, can be used to assess the adversarial robustness of large vision-language models (VLMs), allowing for a more thorough examination of potential security flaws before deployment. However, previous transfer-based adversarial attacks incur high costs due to high iteration counts and complex method structures. Furthermore, because the adversarial semantics they introduce are unnatural, the generated adversarial examples exhibit low transferability. These issues limit the utility of existing methods for assessing robustness. To address them, we propose AdvDiffVLM, which uses diffusion models to generate natural, unrestricted, and targeted adversarial examples via score matching. Specifically, AdvDiffVLM uses Adaptive Ensemble Gradient Estimation (AEGE) to modify the score during the diffusion model's reverse generation process, ensuring that the produced adversarial examples carry natural, targeted adversarial semantics, which improves their transferability. Simultaneously, to improve the quality of adversarial examples, we use GradCAM-guided Mask Generation (GCMG) to disperse adversarial semantics throughout the image rather than concentrating them in a single area. Finally, AdvDiffVLM embeds more target semantics into adversarial examples over multiple iterations. Experimental results show that our method generates adversarial examples 5x to 10x faster than state-of-the-art (SOTA) transfer-based adversarial attacks while producing higher-quality adversarial examples. Furthermore, compared to previous transfer-based adversarial attacks, the adversarial examples generated by our method transfer better. Notably, AdvDiffVLM can successfully attack a variety of commercial VLMs in a black-box setting, including GPT-4V. The code is available at https://github.com/gq-max/AdvDiffVLM.
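The score-modification idea described in the abstract can be illustrated on a toy problem: a reverse score-based sampling loop whose score is shifted by a masked guidance gradient pulling the sample toward target semantics. This is a minimal sketch only; the Gaussian base score, the quadratic target surrogate, and the all-ones mask are stand-ins for the paper's diffusion model, AEGE, and GCMG components, none of which are reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def base_score(x, data_mean, sigma=1.0):
    # Score of an isotropic Gaussian "data prior": grad_x log p(x).
    # A real diffusion model would supply a learned score network here.
    return (data_mean - x) / sigma**2

def target_grad(x, target):
    # Toy surrogate for a gradient that increases similarity to the
    # target semantics (stands in for the ensemble gradient in AEGE).
    return target - x

def guided_reverse_step(x, data_mean, target, mask, scale=0.5, step=0.1):
    # Modified score: base score plus masked, scaled adversarial guidance.
    s = base_score(x, data_mean) + scale * mask * target_grad(x, target)
    noise = rng.normal(size=x.shape) * np.sqrt(step) * 0.01
    return x + step * s + noise

x = rng.normal(size=8)          # toy "image" (8-dim vector)
data_mean = np.zeros(8)         # mode of the toy data prior
target = np.full(8, 0.8)        # toy target-semantics direction
mask = np.ones(8)               # a GCMG-style spatial mask would go here

for _ in range(200):
    x = guided_reverse_step(x, data_mean, target, mask)
```

After the loop, the sample settles between the data prior's mode and the target direction, mirroring how the guidance scale trades off naturalness against targeted semantics.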
Pages: 1333-1348
Page count: 16
Related Papers
50 in total
  • [41] Transferable adversarial distribution learning: Query-efficient adversarial attack against large language models
    Dong, Huoyuan
    Dong, Jialiang
    Wan, Shaohua
    Yuan, Shuai
    Guan, Zhitao
    COMPUTERS & SECURITY, 2023, 135
  • [42] Generating Robot Action Sequences: An Efficient Vision-Language Models with Visual Prompts
    Cai, Weihao
    Mori, Yoshiki
    Shimada, Nobutaka
    2024 INTERNATIONAL WORKSHOP ON INTELLIGENT SYSTEMS, IWIS 2024, 2024
  • [43] Towards Better Vision-Inspired Vision-Language Models
    Cao, Yun-Hao
    Ji, Kaixiang
    Huang, Ziyuan
    Zheng, Chuanyang
    Liu, Jiajia
    Wang, Jian
    Chen, Jingdong
    Yang, Ming
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13537 - 13547
  • [44] Patch is enough: naturalistic adversarial patch against vision-language pre-training models
    Kong, Dehong
    Liang, Siyuan
    Zhu, Xiaopeng
    Zhong, Yuansheng
    Ren, Wenqi
    VISUAL INTELLIGENCE, 2 (1)
  • [45] PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning
    Hussein, Noor
    Shamshad, Fahad
    Naseer, Muzammal
    Nandakumar, Karthik
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT XII, 2024, 15012 : 698 - 708
  • [46] Concept-Based Analysis of Neural Networks via Vision-Language Models
    Mangal, Ravi
    Narodytska, Nina
    Gopinath, Divya
    Hu, Boyue Caroline
    Roy, Anirban
    Jha, Susmit
    Pasareanu, Corina S.
    AI VERIFICATION, SAIV 2024, 2024, 14846 : 49 - 77
  • [47] VinVL: Revisiting Visual Representations in Vision-Language Models
    Zhang, Pengchuan
    Li, Xiujun
    Hu, Xiaowei
    Yang, Jianwei
    Zhang, Lei
    Wang, Lijuan
    Choi, Yejin
    Gao, Jianfeng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5575 - 5584
  • [48] Evaluating Attribute Comprehension in Large Vision-Language Models
    Zhang, Haiwen
    Yang, Zixi
    Liu, Yuanzhi
    Wang, Xinran
    He, Zheqi
    Liang, Kongming
    Ma, Zhanyu
    PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035 : 98 - 113
  • [49] Towards an Exhaustive Evaluation of Vision-Language Foundation Models
    Salin, Emmanuelle
    Ayache, Stephane
    Favre, Benoit
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 339 - 352
  • [50] Attention Prompting on Image for Large Vision-Language Models
    Yu, Runpeng
    Yu, Weihao
    Wang, Xinchao
    COMPUTER VISION - ECCV 2024, PT XXX, 2025, 15088 : 251 - 268