Task Adaptive Parameter Sharing for Multi-Task Learning

Cited by: 26
Authors
Wallingford, Matthew [1 ,2 ]
Li, Hao [2 ]
Achille, Alessandro [2 ]
Ravichandran, Avinash [2 ]
Fowlkes, Charless [2 ]
Bhotika, Rahul [2 ]
Soatto, Stefano [2 ]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] AWS AI Labs, Seattle, WA 98109 USA
Keywords
DOI
10.1109/CVPR52688.2022.00741
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adapting pre-trained models with broad capabilities has become standard practice for learning a wide range of downstream tasks. The typical approach of fine-tuning a separate model for each task is performant but incurs a substantial memory cost. To efficiently learn multiple downstream tasks, we introduce Task Adaptive Parameter Sharing (TAPS), a simple method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers. This enables multi-task learning while minimizing the resources used, and avoids catastrophic forgetting and competition between tasks. TAPS solves a joint optimization problem that determines both the layers to be shared with the base model and the values of the task-specific weights. Further, a sparsity penalty on the number of active layers promotes weight sharing with the base model. Compared to other methods, TAPS retains high accuracy on the target tasks while introducing only a small number of task-specific parameters. Moreover, TAPS is agnostic to the particular architecture used and requires only minor changes to the training scheme. We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
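The mechanism the abstract describes, a per-layer gate that decides between the frozen base weights and a task-specific adaptation, plus a sparsity penalty on the number of active layers, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the class and function names (`TAPSLayer`, `sparsity_penalty`), the hard threshold, and the additive-residual parameterization are assumptions for exposition.

```python
import numpy as np

class TAPSLayer:
    """Sketch of one task-adaptive layer: shared base weights plus a
    gated, task-specific residual (illustrative, not the paper's code)."""

    def __init__(self, w_base, threshold=0.5):
        self.w_base = w_base                  # frozen, shared with the base model
        self.delta = np.zeros_like(w_base)    # task-specific residual (learned)
        self.score = 0.0                      # learned gate score (learned)
        self.threshold = threshold

    def gate(self):
        # Hard indicator: 1 if the layer is adapted for this task, 0 if it
        # stays shared. During training a relaxation (e.g. straight-through)
        # would keep the score trainable; only the hard decision is shown here.
        return 1.0 if self.score > self.threshold else 0.0

    def effective_weights(self):
        # Closed gate -> exactly the base weights (nothing task-specific
        # stored); open gate -> base weights plus the learned residual.
        return self.w_base + self.gate() * self.delta

def sparsity_penalty(layers, lam=1e-2):
    # Penalize the count of active (adapted) layers, promoting sharing
    # with the base model, as in the abstract's sparsity penalty.
    return lam * sum(layer.gate() for layer in layers)
```

In this sketch, the memory cost of a new task is only the residuals of the layers whose gates open, which is why minimizing the number of active layers directly minimizes task-specific parameters.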
Pages: 7551-7560
Page count: 10