GDOD: Effective Gradient Descent using Orthogonal Decomposition for Multi-Task Learning

Cited by: 1
Authors
Dong, Xin [1]
Wu, Ruize [2]
Xiong, Chao [1]
Li, Hai [1]
Cheng, Lei [2]
He, Yong [2]
Qian, Shiyou [3]
Cao, Jian [3]
Mo, Linjian [1]
Affiliations
[1] Ant Group, Shanghai, China
[2] Ant Group, Hangzhou, China
[3] Shanghai Jiao Tong University, Shanghai, China
Keywords
multi-task learning; orthogonal decomposition; gradient conflict
DOI
10.1145/3511808.3557333
CLC Classification
TP [Automation Technology, Computer Technology]
Subject Classification
0812
Abstract
Multi-task learning (MTL) aims to solve multiple related tasks simultaneously and has experienced rapid growth in recent years. However, MTL models often suffer from performance degeneration and negative transfer because several tasks are learned at once. Some related work attributes this problem to conflicting gradients, in which case gradient updates that are useful to all tasks must be selected carefully. To this end, we propose a novel optimization approach for MTL, named GDOD, which manipulates the gradient of each task using an orthogonal basis decomposed from the span of all task gradients. GDOD explicitly decomposes gradients into task-shared and task-conflict components and adopts a general update rule that avoids interference across all task gradients, guiding the update direction by the task-shared components. Moreover, we prove the convergence of GDOD theoretically under both convex and non-convex assumptions. Experimental results on several multi-task datasets not only demonstrate the significant improvement GDOD brings to existing MTL models but also show that our algorithm outperforms state-of-the-art optimization methods in terms of AUC and Logloss.
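
A concrete reading of the update rule sketched in the abstract may help. Below is a minimal NumPy sketch of the described decomposition: build an orthogonal basis of the span of all task gradients, treat a basis direction as task-shared when every task's projection onto it agrees in sign, and aggregate only those shared components into the update. The function name gdod_update, the sign-agreement test, and the use of a QR factorization to obtain the basis are illustrative assumptions, not the authors' implementation, which may handle the conflicting components differently.

    import numpy as np

    def gdod_update(task_grads, eps=1e-12):
        """Illustrative GDOD-style update (a sketch, not the paper's code).

        task_grads: array of shape (K, d), one flattened gradient per task.
        Returns a single (d,) update direction built from the components
        on which all K tasks agree.
        """
        G = np.asarray(task_grads, dtype=float)   # K x d
        # Orthonormal basis of span{g_1, ..., g_K} via thin QR on G^T
        # (assumes the task gradients are linearly independent).
        Q, _ = np.linalg.qr(G.T)                  # d x K, orthonormal columns
        coeffs = G @ Q                            # K x K, row i = g_i in the basis
        # A direction is task-shared when every task's coefficient on it
        # has the same sign (zeros, within eps, conflict with nothing).
        shared = (np.all(coeffs >= -eps, axis=0) |
                  np.all(coeffs <= eps, axis=0))
        # Sum the task contributions along shared directions only;
        # conflicting directions are dropped from the update.
        return Q[:, shared] @ coeffs[:, shared].sum(axis=0)

    # Toy example: the tasks conflict on the first coordinate but agree
    # on the second, so only the second survives in the update.
    g1 = np.array([1.0, 0.0])
    g2 = np.array([-1.0, 1.0])
    print(gdod_update([g1, g2]))  # -> approximately [0. 1.]
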
Pages: 386-395
Page count: 10
Related Papers
50 results in total
  • [1] Multi-task gradient descent for multi-task learning
    Bai, Lu
    Ong, Yew-Soon
    He, Tiantian
    Gupta, Abhishek
    [J]. MEMETIC COMPUTING, 2020, 12 (04): 355-369
  • [2] Conflict-Averse Gradient Descent for Multi-task Learning
    Liu, Bo
    Liu, Xingchao
    Jin, Xiaojie
    Stone, Peter
    Liu, Qiang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Gradient Surgery for Multi-Task Learning
    Yu, Tianhe
    Kumar, Saurabh
    Gupta, Abhishek
    Levine, Sergey
    Hausman, Karol
    Finn, Chelsea
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [4] A Multiple Gradient Descent Design for Multi-Task Learning on Edge Computing: Multi-Objective Machine Learning Approach
    Zhou, Xiaojun
    Gao, Yuan
    Li, Chaojie
    Huang, Zhaoke
    [J]. IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2022, 9 (01): 121-133
  • [5] Drivetrain System Identification in a Multi-Task Learning Strategy using Partial Asynchronous Elastic Averaging Stochastic Gradient Descent
    Staessens, Tom
    Crevecoeur, Guillaume
    [J]. 2020 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS (AIM), 2020: 1549-1554
  • [6] Gradient Descent Decomposition for Multi-objective Learning
    Costa, Marcelo Azevedo
    Braga, Antonio Padua
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2011, 2011, 6936: 377+
  • [7] Learned Weight Sharing for Deep Multi-Task Learning by Natural Evolution Strategy and Stochastic Gradient Descent
    Prellberg, Jonas
    Kramer, Oliver
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [8] Online Multi-Task Learning for Policy Gradient Methods
    Ammar, Haitham Bou
    Eaton, Eric
    Ruvolo, Paul
    Taylor, Matthew E.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32: 1206-1214
  • [9] Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning
    Yoo, Minjong
    Cho, Sangwoo
    Woo, Honguk
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022