Task Adaptive Parameter Sharing for Multi-Task Learning

Cited by: 26
Authors
Wallingford, Matthew [1 ,2 ]
Li, Hao [2 ]
Achille, Alessandro [2 ]
Ravichandran, Avinash [2 ]
Fowlkes, Charless [2 ]
Bhotika, Rahul [2 ]
Soatto, Stefano [2 ]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] AWS AI Labs, Seattle, WA 98109 USA
DOI
10.1109/CVPR52688.2022.00741
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Adapting pre-trained models with broad capabilities has become standard practice for learning a wide range of downstream tasks. The typical approach of fine-tuning different models for each task is performant, but incurs a substantial memory cost. To efficiently learn multiple downstream tasks we introduce Task Adaptive Parameter Sharing (TAPS), a simple method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers. This enables multi-task learning while minimizing the resources used and avoids catastrophic forgetting and competition between tasks. TAPS solves a joint optimization problem which determines both the layers that are shared with the base model and the value of the task-specific weights. Further, a sparsity penalty on the number of active layers promotes weight sharing with the base model. Compared to other methods, TAPS retains a high accuracy on the target tasks while still introducing only a small number of task-specific parameters. Moreover, TAPS is agnostic to the particular architecture used and requires only minor changes to the training scheme. We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
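The idea sketched in the abstract (frozen base weights, a per-layer task-specific residual, and a gate whose sparsity penalty counts active layers) can be illustrated as follows. This is a minimal sketch, not the authors' implementation: the names (`gate`, `effective_weights`, `sparsity_penalty`), the hard-threshold gate standing in for the relaxed indicator used during training, and the penalty weight `lam` are all illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the TAPS idea. Each layer keeps frozen base weights W
# and a task-specific residual D, gated by a learned per-layer score s:
#     W_eff = W + gate(s) * D
# A penalty proportional to the number of active gates encourages most
# layers to fall back to the shared base weights (gate(s) = 0).

def gate(s, threshold=0.1):
    """Indicator: 0 keeps the shared base layer, 1 activates the
    task-specific residual. A hard threshold stands in here for the
    differentiable relaxation one would use during training."""
    return float(s > threshold)

def effective_weights(w_base, delta, s):
    """Weights actually used by this task for one layer."""
    return w_base + gate(s) * delta

def sparsity_penalty(scores, lam=0.25):
    """Penalty proportional to the number of active (task-specific) layers."""
    return lam * sum(gate(s) for s in scores)

# Toy three-layer model; the learned scores make only layer 1 task-specific.
rng = np.random.default_rng(0)
base = [rng.standard_normal((2, 2)) for _ in range(3)]       # frozen base
deltas = [rng.standard_normal((2, 2)) * 0.01 for _ in range(3)]
scores = [0.02, 0.9, 0.05]                                   # gate scores

weights = [effective_weights(w, d, s) for w, d, s in zip(base, deltas, scores)]
active = [i for i, s in enumerate(scores) if gate(s)]
print("task-specific layers:", active)        # only layer 1 diverges
print("penalty:", sparsity_penalty(scores))   # lam * (number of active layers)
```

In this toy setup, layers 0 and 2 reuse the base weights unchanged, so a second task stored this way adds only one layer's worth of new parameters plus three scalars.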
Pages: 7551-7560 (10 pages)
Related Papers
50 items in total
  • [41] Guo, Pengsheng; Lee, Chen-Yu; Ulbricht, Daniel. Learning to Branch for Multi-Task Learning. 25th Americas Conference on Information Systems (AMCIS 2019), 2019.
  • [42] Zhong, Shi; Pu, Jian; Jiang, Yu-Gang; Peng, Rui; Xue, Xiangyang. Flexible multi-task learning with latent task grouping. Neurocomputing, 2016, 189: 179-188.
  • [43] Fifty, Christopher; Amid, Ehsan; Zhao, Zhe; Yu, Tianhe; Anil, Rohan; Finn, Chelsea. Efficiently Identifying Task Groupings for Multi-Task Learning. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021, 34.
  • [44] Jeong, Jun-Yong; Jun, Chi-Hyuck. Variable Selection and Task Grouping for Multi-Task Learning. KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018: 1589-1598.
  • [45] Ackermann, Johannes; Richter, Oliver; Wattenhofer, Roger. Unsupervised Task Clustering for Multi-task Reinforcement Learning. Machine Learning and Knowledge Discovery in Databases, 2021, 12975: 222-237.
  • [46] Wang, Tianxin; Zhuang, Fuzhen; Sun, Ying; Zhang, Xiangliang; Lin, Leyu; Xia, Feng; He, Lei; He, Qing. Adaptively sharing multi-levels of distributed representations in multi-task learning. Information Sciences, 2022, 591: 226-234.
  • [47] Sinodinos, Dimitrios; Armanfard, Narges. Attentive Task Interaction Network for Multi-Task Learning. 2022 26th International Conference on Pattern Recognition (ICPR), 2022: 2885-2891.
  • [48] Miller, Nikolay; Zhang, Guoyi. Additive multi-task learning models and task diagnostics. Communications in Statistics - Simulation and Computation, 2023, 53 (12): 6120-6137.
  • [49] Sun, Gang; Chen, Yanyun; Liu, Xuehui; Wu, Enhua. Adaptive Multi-Task Learning for Fine-Grained Categorization. 2015 IEEE International Conference on Image Processing (ICIP), 2015: 996-1000.
  • [50] Zamani, Hamed; Moradi, Pooya; Shakery, Azadeh. Adaptive User Engagement Evaluation via Multi-task Learning. SIGIR 2015: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015: 1011-1014.