Task Adaptive Parameter Sharing for Multi-Task Learning

Cited by: 26
Authors
Wallingford, Matthew [1 ,2 ]
Li, Hao [2 ]
Achille, Alessandro [2 ]
Ravichandran, Avinash [2 ]
Fowlkes, Charless [2 ]
Bhotika, Rahul [2 ]
Soatto, Stefano [2 ]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] AWS AI Labs, Seattle, WA 98109 USA
Keywords
DOI
10.1109/CVPR52688.2022.00741
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adapting pre-trained models with broad capabilities has become standard practice for learning a wide range of downstream tasks. The typical approach of fine-tuning a separate model for each task is performant but incurs a substantial memory cost. To efficiently learn multiple downstream tasks, we introduce Task Adaptive Parameter Sharing (TAPS), a simple method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers. This enables multi-task learning while minimizing the resources used, and avoids catastrophic forgetting and competition between tasks. TAPS solves a joint optimization problem that determines both the layers to be shared with the base model and the values of the task-specific weights. Further, a sparsity penalty on the number of active layers promotes weight sharing with the base model. Compared to other methods, TAPS retains high accuracy on the target tasks while introducing only a small number of task-specific parameters. Moreover, TAPS is agnostic to the particular architecture used and requires only minor changes to the training scheme. We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
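The mechanism the abstract describes, a per-layer gate that decides between the frozen base weights and a task-specific adaptation, plus a sparsity penalty on the number of active layers, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the class and function names (`TAPSLayer`, `sparsity_penalty`), the hard threshold, and the additive-residual parameterization are assumptions for exposition.

```python
import numpy as np

class TAPSLayer:
    """Sketch of one task-adaptive layer: shared base weights plus a
    gated, task-specific residual (illustrative, not the paper's code)."""

    def __init__(self, w_base, threshold=0.5):
        self.w_base = w_base                  # frozen, shared with the base model
        self.delta = np.zeros_like(w_base)    # task-specific residual (learned)
        self.score = 0.0                      # learned gate score (learned)
        self.threshold = threshold

    def gate(self):
        # Hard indicator: 1 if the layer is adapted for this task, 0 if it
        # stays shared. During training a relaxation (e.g. straight-through)
        # would keep the score trainable; only the hard decision is shown here.
        return 1.0 if self.score > self.threshold else 0.0

    def effective_weights(self):
        # Closed gate -> exactly the base weights (nothing task-specific
        # stored); open gate -> base weights plus the learned residual.
        return self.w_base + self.gate() * self.delta

def sparsity_penalty(layers, lam=1e-2):
    # Penalize the count of active (adapted) layers, promoting sharing
    # with the base model, as in the abstract's sparsity penalty.
    return lam * sum(layer.gate() for layer in layers)
```

In this sketch, the memory cost of a new task is only the residuals of the layers whose gates open, which is why minimizing the number of active layers directly minimizes task-specific parameters.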
Pages: 7551-7560
Page count: 10