Learning to Learn Better Visual Prompts

Cited by: 0
Authors
Wang, Fengxiang [1 ]
Huang, Wanrong [1 ]
Yang, Shaowu [1 ]
Qi, Fan [2 ]
Lan, Long [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp Sci & Technol, HPCL, Changsha, Hunan, Peoples R China
[2] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
None available
CLC classification
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prompt tuning offers a low-cost way to adapt vision-language models (VLMs) to various downstream vision tasks without updating the huge set of pre-trained parameters. Dispensing with the conventional manual crafting of prompts, the recent prompt-tuning method Context Optimization (CoOp) introduces learnable vectors as text prompts. Nevertheless, several previous works point out that CoOp-based approaches overfit easily to the base classes and generalize poorly to novel classes. In this paper, we argue that prompt tuning works well only on the base classes because of the limited capacity of the learnable vectors: the pre-trained model is roughly a hundred times the scale of the learnable vector, so the learned vector has very limited ability to absorb knowledge of novel classes. To reduce this excessive overfitting of textual knowledge to the base classes, we view prompt tuning as learning to learn (LoL) and learn the prompt through meta-learning: dividing the base classes into many different subclasses during training fully exerts the limited capacity of prompt tuning and thus transfers its power to recognizing the novel classes. Specifically, we first fine-tune the pre-trained CLIP on the base classes with the CoOp method. Then, starting from the fine-tuned CLIP model, we perform further fine-tuning on the base classes in an N-way K-shot manner from the meta-learning perspective. Finally, we apply the learned textual vector and the VLM to unseen classes. Extensive experiments on benchmark datasets validate the efficacy of our meta-learning-informed prompt tuning, affirming its role as a robust optimization strategy for VLMs.
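The episodic training the abstract describes starts from N-way K-shot sampling: each meta-learning step draws N base classes, K labeled support examples, and a handful of query examples per class. A minimal sketch of that sampling step is shown below; the function name `sample_episode` and the toy data are illustrative, not part of the paper's released code.

```python
import random

def sample_episode(base_classes, n_way, k_shot, q_query, seed=None):
    """Draw one N-way K-shot episode from a pool of base-class data.

    base_classes: dict mapping class name -> list of examples.
    Returns (support, query), each a list of (example, class) pairs,
    with no example shared between the support and query sets.
    """
    rng = random.Random(seed)
    # Pick N distinct classes for this episode.
    classes = rng.sample(sorted(base_classes), n_way)
    support, query = [], []
    for c in classes:
        # Draw K + Q distinct examples; first K go to support, rest to query.
        picks = rng.sample(base_classes[c], k_shot + q_query)
        support += [(x, c) for x in picks[:k_shot]]
        query += [(x, c) for x in picks[k_shot:]]
    return support, query

# Toy base-class pool: 5 classes with 10 examples each.
pool = {f"class{i}": [f"img{i}_{j}" for j in range(10)] for i in range(5)}
support, query = sample_episode(pool, n_way=3, k_shot=2, q_query=4, seed=0)
```

In the method's second stage, each such episode would drive one update of the learnable prompt vectors: the (frozen) CLIP encoders score the query examples against text prompts built for the N sampled classes, and the loss on the query set updates only the prompt.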
Pages: 5354 - 5363
Page count: 10