Visual-Language Prompt Tuning with Knowledge-guided Context Optimization

Cited by: 59
Authors
Yao, Hantao [1]
Zhang, Rui [2]
Xu, Changsheng [1, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence, Beijing, Peoples R China
[2] Chinese Acad Sci, State Key Lab Processors, Inst Comp Technol, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
DOI
10.1109/CVPR52729.2023.00653
Chinese Library Classification (CLC) number
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Prompt tuning is an effective way to adapt a pretrained visual-language model (VLM) to a downstream task using task-related textual tokens. Representative CoOp-based work combines learnable textual tokens with the class tokens to obtain specific textual knowledge. However, this specific textual knowledge generalizes poorly to unseen classes because it forgets the essential general textual knowledge, which has strong generalization ability. To tackle this issue, we introduce a novel Knowledge-guided Context Optimization (KgCoOp) to enhance the generalization ability of the learnable prompt for unseen classes. The key insight of KgCoOp is that the forgetting of essential knowledge can be alleviated by reducing the discrepancy between the learnable prompt and the hand-crafted prompt. Specifically, KgCoOp minimizes the discrepancy between the textual embeddings generated by the learned prompts and those generated by the hand-crafted prompts. Finally, adding the KgCoOp constraint to the contrastive loss yields a prompt that is discriminative for both seen and unseen tasks. Extensive evaluation on several benchmarks demonstrates that the proposed Knowledge-guided Context Optimization is an efficient method for prompt tuning, i.e., it achieves better performance with less training time.
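The objective described in the abstract (a contrastive loss plus a penalty on the discrepancy between learned-prompt and hand-crafted-prompt text embeddings) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption for illustration, not taken from this record: the discrepancy is taken to be the mean squared Euclidean distance, the contrastive term is a cross-entropy over temperature-scaled cosine similarities, and the names `kg_discrepancy`, `contrastive_loss`, `kgcoop_style_loss`, `lam`, and `temperature` are hypothetical. Plain lists of floats stand in for the outputs of a real text and image encoder.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def kg_discrepancy(learned, handcrafted):
    """Mean squared Euclidean distance between per-class text embeddings.

    Assumption: the abstract only says "discrepancy"; the squared
    Euclidean distance is one plausible choice.
    """
    total = 0.0
    for w, w_ref in zip(learned, handcrafted):
        total += sum((a - b) ** 2 for a, b in zip(w, w_ref))
    return total / len(learned)


def contrastive_loss(image_emb, learned, label, temperature=0.01):
    """Cross-entropy over cosine similarities between one image
    embedding and every class text embedding."""
    x = l2_normalize(image_emb)
    logits = [
        sum(a * b for a, b in zip(x, l2_normalize(w))) / temperature
        for w in learned
    ]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def kgcoop_style_loss(image_emb, learned, handcrafted, label, lam=1.0):
    """Contrastive term plus a weighted knowledge-guided penalty.

    `lam` is a hypothetical trade-off weight.
    """
    return (contrastive_loss(image_emb, learned, label)
            + lam * kg_discrepancy(learned, handcrafted))
```

When the learned embeddings coincide with the hand-crafted ones the penalty is zero and only the contrastive term remains. As the learned prompt drifts away during training the penalty grows, which is the mechanism the abstract credits with preserving general knowledge for unseen classes.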
Pages: 6757-6767
Page count: 11