Efficient Transfer Learning for Visual Tasks via Continuous Optimization of Prompts

Cited by: 3

Authors:

Conder, Jonathan [1]

Jefferson, Josephine [1]

Pages, Nathan [1]

Jawed, Khurram [1]

Nejati, Alireza [1]

Sagar, Mark [1]

Affiliation:

[1] Soul Machines, Auckland, New Zealand

Source:

IMAGE ANALYSIS AND PROCESSING, ICIAP 2022, PT I | 2022 / Vol. 13231

Keywords:

Computer vision; Few-shot; Fine-tuning; Prompt engineering; Prefix-tuning; CLIP; Transformers; Vision transformers; LAND-USE; BENCHMARK; EUROSAT; DATASET;

DOI:

10.1007/978-3-031-06427-2_25

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Traditional methods for adapting pre-trained vision models to downstream tasks involve fine-tuning some or all of the model's parameters. There are a number of trade-offs with this approach. When too many parameters are fine-tuned, the model may lose the benefits associated with pre-training, such as the ability to generalize to out-of-distribution data. But, if instead too few parameters are fine-tuned, the model may be unable to adapt effectively for the tasks downstream. In this paper, we propose Visual Prompt Tuning (VPT) as an alternative to fine-tuning for Transformer-based vision models. Our method is closely related to, and inspired by, prefix-tuning of language models [22]. We find that, by adding additional parameters to a pre-trained model, VPT offers similar performance to fine-tuning the final layer. In addition, for low-data settings and for specialized tasks, such as traffic sign recognition, satellite photo recognition and handwriting classification, the performance of Transformer-based vision models is improved with the use of VPT.


Pages: 297-309

Page count: 13

50 items in total

[41] Learning to Discover Novel Visual Categories via Deep Transfer Clustering
Han, Kai
Vedaldi, Andrea
Zisserman, Andrew
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8400 - 8408
[42] Self-explanation prompts in video learning: an optimization study
Wang, Liu
Xu, GuangTao
EDUCATION AND INFORMATION TECHNOLOGIES, 2024, 29 (17) : 23441 - 23462
[43] RobustPrompt: Learning to defend against adversarial attacks with adaptive visual prompts
Liu, Chang
Xiang, Wenzhao
Dong, Yinpeng
Zhang, Xingxing
Wang, Liyuan
Duan, Ranjie
Zheng, Shibao
Su, Hang
PATTERN RECOGNITION LETTERS, 2025, 190 : 161 - 168
[44] Optimization of visual tasks for detecting visual cortex activity in fMRI studies
Mirzajani, A.
Riyahi-Alam, N.
Oghabian, M. A.
Bakhtiary, M.
Saberi, H.
Firouznia, K.
WORLD CONGRESS ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING 2006, VOL 14, PTS 1-6, 2007, 14 : 1388 - +
[45] Competitive reinforcement learning in continuous control tasks
Abramson, M
Pachowicz, P
Wechsler, H
PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 1909 - 1914
[46] Efficient nonconvex sparse group feature selection via continuous and discrete optimization
Xiang, Shuo
Shen, Xiaotong
Ye, Jieping
ARTIFICIAL INTELLIGENCE, 2015, 224 : 28 - 50
[47] Two Steps Reinforcement Learning in Continuous Reinforcement Learning Tasks
Lopez-Bueno, Ivan
Garcia, Javier
Fernandez, Fernando
BIO-INSPIRED SYSTEMS: COMPUTATIONAL AND AMBIENT INTELLIGENCE, PT 1, 2009, 5517 : 577 - 584
[48] Fast and Memory Efficient Graph Optimization via ICM for Visual Place Recognition
Schubert, Stefan
Neubert, Peer
Protzel, Peter
ROBOTICS: SCIENCE AND SYSTEM XVII, 2021,
[49] Fast and Memory Efficient Graph Optimization via ICM for Visual Place Recognition
Schubert, Stefan
Neubert, Peer
Protzel, Peter
Robotics: Science and Systems, 2021,
[50] Scheduling Constrained Cloud Workflow Tasks via Evolutionary Multitasking Optimization With Adaptive Knowledge Transfer
Zhou, Jiajun
Gao, Liang
Rao, Shijie
Li, Yun
IEEE TRANSACTIONS ON SERVICES COMPUTING, 2024, 17 (06) : 4254 - 4266
