MTFormer: Multi-task Learning via Transformer and Cross-Task Reasoning

Cited by: 8
|
Authors
Xu, Xiaogang [1 ]
Zhao, Hengshuang [2 ,3 ]
Vineet, Vibhav [4 ]
Lim, Ser-Nam [5 ]
Torralba, Antonio [2 ]
Affiliations
[1] CUHK, Hong Kong, Peoples R China
[2] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] HKU, Hong Kong, Peoples R China
[4] Microsoft Res, Redmond, WA USA
[5] Meta AI, New York, NY USA
Source
Keywords
Multi-task learning; Transformer; Cross-task reasoning;
DOI
10.1007/978-3-031-19812-0_18
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we explore the advantages of utilizing transformer structures for multi-task learning (MTL). Specifically, we demonstrate that models with transformer structures are more suitable for MTL than convolutional neural networks (CNNs), and we propose a novel transformer-based architecture named MTFormer for MTL. In this framework, multiple tasks share the same transformer encoder and transformer decoder, and lightweight branches are introduced to harvest task-specific outputs, which improves MTL performance while reducing time and space complexity. Furthermore, since information from different task domains can benefit each other, we conduct cross-task reasoning and propose a cross-task attention mechanism to further boost MTL results. The cross-task attention mechanism adds few parameters and little computation while yielding additional performance improvements. Besides, we design a self-supervised cross-task contrastive learning algorithm to further boost MTL performance. Extensive experiments on two multi-task learning datasets show that MTFormer achieves state-of-the-art results with limited network parameters and computation. It also demonstrates significant advantages in few-shot and zero-shot learning.
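The record does not include code, but the cross-task attention the abstract describes can be sketched as standard scaled dot-product attention in which one task's features supply the queries while another task's features supply the keys and values. The function and variable names below (e.g. `feat_seg`, `feat_dep`) are illustrative assumptions for a two-task setting, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_task_attention(feat_a, feat_b, w_q, w_k, w_v):
    """Let task A's tokens attend to task B's tokens.

    feat_a, feat_b: (tokens, dim) feature maps flattened into token sequences.
    w_q, w_k, w_v:  (dim, dim) projection matrices (hypothetical parameters).
    """
    q = feat_a @ w_q                      # queries from task A
    k = feat_b @ w_k                      # keys from task B
    v = feat_b @ w_v                      # values from task B
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v   # task-A tokens enriched with task-B info

rng = np.random.default_rng(0)
n, d = 16, 32
feat_seg = rng.standard_normal((n, d))   # e.g. segmentation-branch tokens
feat_dep = rng.standard_normal((n, d))   # e.g. depth-branch tokens
w_q, w_k, w_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
out = cross_task_attention(feat_seg, feat_dep, w_q, w_k, w_v)
print(out.shape)  # fused features keep the query stream's shape: (16, 32)
```

Because the attention output has the same shape as the query stream, such a module can be dropped between the shared decoder and a task-specific branch without changing the surrounding tensor shapes, which is consistent with the abstract's claim of adding few parameters and little computation.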
Pages: 304-321
Page count: 18
Related Papers (50 total)
  • [31] MULTI-TASK LEARNING WITH CROSS ATTENTION FOR KEYWORD SPOTTING
    Higuchi, Takuya
    Gupta, Anmol
    Dhir, Chandra
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 571 - 578
  • [32] Efficient and Effective Multi-task Grouping via Meta Learning on Task Combinations
    Song, Xiaozhuang
    Zheng, Shun
    Cao, Wei
    Yu, James J. Q.
    Bian, Jiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [33] Learning Multi-Level Task Groups in Multi-Task Learning
    Han, Lei
    Zhang, Yu
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 2638 - 2644
  • [34] Multi-step Forecasting via Multi-task Learning
    Jawed, Shayan
    Rashed, Ahmed
    Schmidt-Thieme, Lars
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 790 - 799
  • [35] Fairness in Multi-Task Learning via Wasserstein Barycenters
    Hu, Francois
    Ratz, Philipp
    Charpentier, Arthur
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT II, 2023, 14170 : 295 - 312
  • [36] Wind Speed Forecasting via Multi-task Learning
    Lencione, Gabriel R.
    Von Zuben, Fernando J.
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [37] Improving Evidential Deep Learning via Multi-Task Learning
    Oh, Dongpin
    Shin, Bonggun
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 7895 - 7903
  • [38] Attribution of Adversarial Attacks via Multi-task Learning
    Guo, Zhongyi
    Han, Keji
    Ge, Yao
    Li, Yun
    Ji, Wei
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT II, 2024, 14448 : 81 - 94
  • [39] Efficient Multi-Task Learning via Generalist Recommender
    Wang, Luyang
    Tang, Cangcheng
    Zhang, Chongyang
    Ruan, Jun
    Huang, Kai
    Dai, Jason
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 4335 - 4339
  • [40] Modeling disease progression via multi-task learning
    Zhou, Jiayu
    Liu, Jun
    Narayan, Vaibhav A.
    Ye, Jieping
    NEUROIMAGE, 2013, 78 : 233 - 248