Efficient Generation of Targeted and Transferable Adversarial Examples for Vision-Language Models via Diffusion Models

Cited: 0
Authors
Guo, Qi [1 ,2 ]
Pang, Shanmin [1 ]
Jia, Xiaojun [3 ]
Liu, Yang [3 ]
Guo, Qing [2 ,4 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[2] Agcy Sci Technol & Res, Ctr Frontier AI Res, Singapore 138632, Singapore
[3] Nanyang Technol Univ, Coll Comp & Data Sci, Singapore 639798, Singapore
[4] Agcy Sci Technol & Res, Inst High Performance Comp, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China; National Research Foundation of Singapore;
Keywords
Adversarial attack; visual language models; diffusion models; score matching;
DOI
10.1109/TIFS.2024.3518072
CLC Number
TP301 [Theory and Methods];
Discipline Code
081202;
Abstract
Adversarial attacks, particularly targeted transfer-based attacks, can be used to assess the adversarial robustness of large vision-language models (VLMs), allowing for a more thorough examination of potential security flaws before deployment. However, previous transfer-based adversarial attacks incur high costs due to high iteration counts and complex method structures. Furthermore, because the adversarial semantics they introduce are unnatural, the generated adversarial examples exhibit low transferability. These issues limit the utility of existing methods for assessing robustness. To address them, we propose AdvDiffVLM, which uses diffusion models to generate natural, unrestricted, and targeted adversarial examples via score matching. Specifically, AdvDiffVLM uses Adaptive Ensemble Gradient Estimation (AEGE) to modify the score during the diffusion model's reverse generation process, ensuring that the produced adversarial examples carry natural, targeted adversarial semantics, which improves their transferability. Simultaneously, to improve the quality of adversarial examples, we use GradCAM-guided Mask Generation (GCMG) to disperse adversarial semantics throughout the image rather than concentrating them in a single area. Finally, AdvDiffVLM embeds more target semantics into adversarial examples over multiple iterations. Experimental results show that our method generates adversarial examples 5x to 10x faster than state-of-the-art (SOTA) transfer-based adversarial attacks while producing higher-quality adversarial examples. Furthermore, compared to previous transfer-based adversarial attacks, the adversarial examples generated by our method transfer better. Notably, AdvDiffVLM can successfully attack a variety of commercial VLMs in a black-box setting, including GPT-4V. The code is available at https://github.com/gq-max/AdvDiffVLM.
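The score-modification idea described in the abstract can be illustrated on a toy problem: a reverse score-based sampling loop whose score is shifted by a masked guidance gradient pulling the sample toward target semantics. This is a minimal sketch only; the Gaussian base score, the quadratic target surrogate, and the all-ones mask are stand-ins for the paper's diffusion model, AEGE, and GCMG components, none of which are reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def base_score(x, data_mean, sigma=1.0):
    # Score of an isotropic Gaussian "data prior": grad_x log p(x).
    # A real diffusion model would supply a learned score network here.
    return (data_mean - x) / sigma**2

def target_grad(x, target):
    # Toy surrogate for a gradient that increases similarity to the
    # target semantics (stands in for the ensemble gradient in AEGE).
    return target - x

def guided_reverse_step(x, data_mean, target, mask, scale=0.5, step=0.1):
    # Modified score: base score plus masked, scaled adversarial guidance.
    s = base_score(x, data_mean) + scale * mask * target_grad(x, target)
    noise = rng.normal(size=x.shape) * np.sqrt(step) * 0.01
    return x + step * s + noise

x = rng.normal(size=8)          # toy "image" (8-dim vector)
data_mean = np.zeros(8)         # mode of the toy data prior
target = np.full(8, 0.8)        # toy target-semantics direction
mask = np.ones(8)               # a GCMG-style spatial mask would go here

for _ in range(200):
    x = guided_reverse_step(x, data_mean, target, mask)
```

After the loop, the sample settles between the data prior's mode and the target direction, mirroring how the guidance scale trades off naturalness against targeted semantics.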
Pages: 1333-1348
Page count: 16
Related Papers
50 in total
  • [41] Transferable adversarial distribution learning: Query-efficient adversarial attack against large language models
    Dong, Huoyuan
    Dong, Jialiang
    Wan, Shaohua
    Yuan, Shuai
    Guan, Zhitao
    COMPUTERS & SECURITY, 2023, 135
  • [42] Generating Robot Action Sequences: An Efficient Vision-Language Models with Visual Prompts
    Cai, Weihao
    Mori, Yoshiki
    Shimada, Nobutaka
    2024 INTERNATIONAL WORKSHOP ON INTELLIGENT SYSTEMS, IWIS 2024, 2024
  • [43] Towards Better Vision-Inspired Vision-Language Models
    Cao, Yun-Hao
    Ji, Kaixiang
    Huang, Ziyuan
    Zheng, Chuanyang
    Liu, Jiajia
    Wang, Jian
    Chen, Jingdong
    Yang, Ming
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13537 - 13547
  • [44] Patch is enough: naturalistic adversarial patch against vision-language pre-training models
    Kong, Dehong
    Liang, Siyuan
    Zhu, Xiaopeng
    Zhong, Yuansheng
    Ren, Wenqi
    VISUAL INTELLIGENCE, 2 (1)
  • [45] PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning
    Hussein, Noor
    Shamshad, Fahad
    Naseer, Muzammal
    Nandakumar, Karthik
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT XII, 2024, 15012 : 698 - 708
  • [46] Concept-Based Analysis of Neural Networks via Vision-Language Models
    Mangal, Ravi
    Narodytska, Nina
    Gopinath, Divya
    Hu, Boyue Caroline
    Roy, Anirban
    Jha, Susmit
    Pasareanu, Corina S.
    AI VERIFICATION, SAIV 2024, 2024, 14846 : 49 - 77
  • [47] VinVL: Revisiting Visual Representations in Vision-Language Models
    Zhang, Pengchuan
    Li, Xiujun
    Hu, Xiaowei
    Yang, Jianwei
    Zhang, Lei
    Wang, Lijuan
    Choi, Yejin
    Gao, Jianfeng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5575 - 5584
  • [48] Evaluating Attribute Comprehension in Large Vision-Language Models
    Zhang, Haiwen
    Yang, Zixi
    Liu, Yuanzhi
    Wang, Xinran
    He, Zheqi
    Liang, Kongming
    Ma, Zhanyu
    PATTERN RECOGNITION AND COMPUTER VISION, PT V, PRCV 2024, 2025, 15035 : 98 - 113
  • [49] Towards an Exhaustive Evaluation of Vision-Language Foundation Models
    Salin, Emmanuelle
    Ayache, Stephane
    Favre, Benoit
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 339 - 352
  • [50] Attention Prompting on Image for Large Vision-Language Models
    Yu, Runpeng
    Yu, Weihao
    Wang, Xinchao
    COMPUTER VISION - ECCV 2024, PT XXX, 2025, 15088 : 251 - 268