Efficient Transfer Learning for Visual Tasks via Continuous Optimization of Prompts

Cited by: 3
Authors
Conder, Jonathan [1]
Jefferson, Josephine [1]
Pages, Nathan [1]
Jawed, Khurram [1]
Nejati, Alireza [1]
Sagar, Mark [1]
Affiliation
[1] Soul Machines, Auckland, New Zealand
Keywords
Computer vision; Few-shot; Fine-tuning; Prompt engineering; Prefix-tuning; CLIP; Transformers; Vision transformers; Land-use; Benchmark; EuroSAT; Dataset
DOI
10.1007/978-3-031-06427-2_25
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditional methods for adapting pre-trained vision models to downstream tasks involve fine-tuning some or all of the model's parameters. There are a number of trade-offs with this approach. When too many parameters are fine-tuned, the model may lose the benefits associated with pre-training, such as the ability to generalize to out-of-distribution data. But, if instead too few parameters are fine-tuned, the model may be unable to adapt effectively for the tasks downstream. In this paper, we propose Visual Prompt Tuning (VPT) as an alternative to fine-tuning for Transformer-based vision models. Our method is closely related to, and inspired by, prefix-tuning of language models [22]. We find that, by adding additional parameters to a pre-trained model, VPT offers similar performance to fine-tuning the final layer. In addition, for low-data settings and for specialized tasks, such as traffic sign recognition, satellite photo recognition and handwriting classification, the performance of Transformer-based vision models is improved with the use of VPT.
Pages: 297-309
Page count: 13