JoAPR: Cleaning the Lens of Prompt Learning for Vision-Language Models

Cited by: 0
Authors
Guo, Yuncheng [1 ]
Guo, Xiaodong [1 ]
Affiliations
[1] Fudan Univ, Dept Elect Engn, Shanghai 200438, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52733.2024.02711
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Leveraging few-shot datasets in prompt learning for Vision-Language Models eliminates the need for manual prompt engineering, but it also makes accurate label annotation essential. High levels of label noise, or noise with complex structure, therefore pose a serious challenge to prompt learning for Vision-Language Models. To address this issue, we propose a new framework for improving its robustness. Specifically, we introduce Joint Adaptive Partitioning for Label Refurbishment (JoAPR), a structured framework comprising two key steps: 1) Data Partitioning, where we differentiate between clean and noisy data using joint adaptive thresholds, and 2) Label Refurbishment, where we correct the labels based on the partition outcomes before retraining the network. Our comprehensive experiments confirm that JoAPR substantially enhances the robustness of prompt learning for Vision-Language Models against label noise, offering a promising direction for future research.
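The abstract outlines a two-step loop: partition the training samples into clean and noisy subsets, then refurbish the labels before retraining. Below is a minimal, hypothetical Python sketch of that generic partition-then-refurbish pattern, not the authors' implementation: it assumes a DivideMix-style two-component Gaussian mixture over per-sample losses as a stand-in partitioning rule and a convex mix of the given labels with model predictions as the refurbishment rule, whereas JoAPR's joint adaptive thresholds and refurbishment scheme differ in detail.

```python
# Hedged illustration only: a generic partition-then-refurbish loop in the
# spirit of the abstract. The GMM-over-losses partitioning is an assumption
# borrowed from DivideMix-style methods, not JoAPR's actual rule.
import numpy as np
from sklearn.mixture import GaussianMixture

def partition_clean_noisy(losses, clean_threshold=0.5):
    """Fit a 2-component GMM to per-sample losses; the low-loss component
    is treated as 'clean'. Returns a boolean mask and P(clean | loss)."""
    losses = np.asarray(losses, dtype=np.float64).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
    clean_comp = int(np.argmin(gmm.means_.ravel()))      # low-loss mode
    p_clean = gmm.predict_proba(losses)[:, clean_comp]
    return p_clean > clean_threshold, p_clean

def refurbish_labels(onehot_labels, model_probs, p_clean):
    """Convexly mix the given (possibly noisy) one-hot labels with the
    model's predicted distribution, weighted by the clean probability."""
    w = p_clean[:, None]
    return w * onehot_labels + (1.0 - w) * model_probs

# Toy usage: 6 samples, 3 classes; 4 low-loss (clean) and 2 high-loss (noisy).
rng = np.random.default_rng(0)
losses = np.concatenate([rng.normal(0.2, 0.05, 4), rng.normal(2.0, 0.3, 2)])
labels = np.eye(3)[[0, 1, 2, 0, 1, 2]]          # given (possibly noisy) labels
probs = rng.dirichlet(np.ones(3), size=6)       # model's predicted distributions
mask, p_clean = partition_clean_noisy(losses)
new_targets = refurbish_labels(labels, probs, p_clean)  # retrain on these
```

In this pattern, samples the mixture assigns to the low-loss mode keep their labels almost unchanged, while high-loss samples are pulled toward the model's predictions before the prompt learner is retrained.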
Pages: 28695-28705
Page count: 11
Related Papers
50 records in total
  • [31] A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models
    Jin, Woojeong
    Cheng, Yu
    Shen, Yelong
    Chen, Weizhu
    Ren, Xiang
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022: 2763-2775
  • [32] Active Prompt Learning in Vision Language Models
    Bang, Jihwan
    Ahn, Sumyeong
    Lee, Jae-Gil
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024: 26994-27004
  • [33] UP-DP: Unsupervised Prompt Learning for Data Pre-Selection with Vision-Language Models
    Li, Xin
    Behpour, Sima
    Doan, Thang
    He, Wenbin
    Gou, Liang
    Ren, Liu
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [34] Learning with Enriched Inductive Biases for Vision-Language Models
    Yang, Lingxiao
    Zhang, Ru-Yuan
    Chen, Qi
    Xie, Xiaohua
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2025
  • [35] CTPT: Continual Test-time Prompt Tuning for vision-language models
    Wang, Fan
    Han, Zhongyi
    Liu, Xingbo
    Yin, Yilong
    Gao, Xin
    PATTERN RECOGNITION, 2025, 161
  • [36] UMPA: Unified multi-modal prompt with adapter for vision-language models
    Jin, Zhengwei
    Wei, Yun
    MULTIMEDIA SYSTEMS, 2025, 31 (02)
  • [37] CPT: Colorful Prompt Tuning for pre-trained vision-language models
    Yao, Yuan
    Zhang, Ao
    Zhang, Zhengyan
    Liu, Zhiyuan
    Chua, Tat-Seng
    Sun, Maosong
AI OPEN, 2024, 5: 30-38
  • [38] Prompt-guided and multimodal landscape scenicness assessments with vision-language models
    Levering, Alex
    Marcos, Diego
    Jacobs, Nathan
    Tuia, Devis
PLOS ONE, 2024, 19 (09)
  • [39] Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model
    Du, Yu
    Wei, Fangyun
    Zhang, Zihe
    Shi, Miaojing
    Gao, Yue
    Li, Guoqi
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022: 14064-14073
  • [40] Prompt-Ladder: Memory-efficient prompt tuning for vision-language models on edge devices
    Cai, Siqi
    Liu, Xuan
    Yuan, Jingling
    Zhou, Qihua
    PATTERN RECOGNITION, 2025, 163