PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning

Cited by: 0
Authors
Hussein, Noor [1 ]
Shamshad, Fahad [1 ]
Naseer, Muzammal [1 ]
Nandakumar, Karthik [1 ]
Affiliations
[1] Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Keywords
Certified Robustness; Medical Vision-Language Models; Prompt tuning; Randomized smoothing;
DOI
10.1007/978-3-031-72390-2_65
CLC classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Medical vision-language models (Med-VLMs) trained on large datasets of medical image-text pairs and later fine-tuned for specific tasks have emerged as a mainstream paradigm in medical image analysis. However, recent studies have highlighted the susceptibility of these Med-VLMs to adversarial attacks, raising concerns about their safety and robustness. Randomized smoothing is a well-known technique for turning any classifier into a model that is certifiably robust to adversarial perturbations. However, this approach requires retraining the Med-VLM-based classifier so that it classifies well under Gaussian noise, which is often infeasible in practice. In this paper, we propose a novel framework called PromptSmooth to achieve efficient certified robustness of Med-VLMs by leveraging the concept of prompt learning. Given any pre-trained Med-VLM, PromptSmooth adapts it to handle Gaussian noise by learning textual prompts in a zero-shot or few-shot manner, achieving a delicate balance between accuracy and robustness, while minimizing the computational overhead. Moreover, PromptSmooth requires only a single model to handle multiple noise levels, which substantially reduces the computational cost compared to traditional methods that rely on training a separate model for each noise level. Comprehensive experiments based on three Med-VLMs and across six downstream datasets of various imaging modalities demonstrate the efficacy of PromptSmooth. Our code and models are available at https://github.com/nhussein/PromptSmooth.
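The randomized smoothing the abstract builds on replaces a base classifier f with a smoothed classifier g that returns the class f predicts most often under isotropic Gaussian noise added to the input. The following minimal sketch (not from the paper; the function and the toy classifier are illustrative) shows the majority-vote prediction step that the paper's prompt-learned Med-VLMs must perform well under:

```python
import numpy as np

def smoothed_predict(classifier, x, sigma=0.25, n_samples=100, seed=0):
    """Majority-vote prediction of the smoothed classifier g(x).

    `classifier` maps a 1-D array to an integer class label. g returns
    the label predicted most often when N(0, sigma^2 I) noise is added
    to x, which is why the base model must classify well under noise.
    """
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n_samples):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        label = classifier(noisy)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy base classifier: predicts 1 if the input's mean is positive.
toy = lambda v: int(v.mean() > 0)
x = np.full(16, 0.5)                 # input well inside class 1
print(smoothed_predict(toy, x))      # -> 1
```

In the full certification procedure (Cohen et al. style), the vote margin additionally yields a certified L2 radius around x; only the prediction step is sketched here.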
Pages: 698-708
Page count: 11