GDOD: Effective Gradient Descent using Orthogonal Decomposition for Multi-Task Learning

Cited by: 1
Authors
Dong, Xin [1 ]
Wu, Ruize [2 ]
Xiong, Chao [1 ]
Li, Hai [1 ]
Cheng, Lei [2 ]
He, Yong [2 ]
Qian, Shiyou [3 ]
Cao, Jian [3 ]
Mo, Linjian [1 ]
Affiliations
[1] Ant Grp, Shanghai, Peoples R China
[2] Ant Grp, Hangzhou, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Keywords
multi-task learning; orthogonal decomposition; gradient conflict;
DOI
10.1145/3511808.3557333
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline Code
0812;
Abstract
Multi-task learning (MTL) aims at solving multiple related tasks simultaneously and has experienced rapid growth in recent years. However, MTL models often suffer from performance degeneration with negative transfer due to learning several tasks simultaneously. Some related work attributes the source of the problem to conflicting gradients, in which case useful gradient updates need to be selected carefully for all tasks. To this end, we propose a novel optimization approach for MTL, named GDOD, which manipulates the gradients of each task using an orthogonal basis decomposed from the span of all task gradients. GDOD explicitly decomposes gradients into task-shared and task-conflict components and adopts a general update rule that avoids interference across all task gradients, allowing the update directions to be guided by the task-shared components. Moreover, we prove the convergence of GDOD theoretically under both convex and non-convex assumptions. Experimental results on several multi-task datasets not only demonstrate the significant improvement that GDOD brings to existing MTL models but also show that our algorithm outperforms state-of-the-art optimization methods in terms of AUC and Logloss metrics.
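Based only on the description in the abstract, the core idea can be sketched roughly as follows. This is an illustrative assumption, not the exact GDOD update rule derived in the paper: the function name `orthogonal_decomposition_update`, the sign-agreement test for deciding which basis directions are "task-shared", and the choice of summing the task gradients before projection are all hypothetical.

```python
import numpy as np

def orthogonal_decomposition_update(task_grads):
    """Illustrative sketch: combine per-task gradients via an orthonormal
    basis of their span, keeping only the basis directions on which all
    tasks' projections agree in sign ("task-shared") and discarding the
    rest ("task-conflict").

    NOTE: this is an assumption-based reading of the abstract, not the
    exact GDOD update rule from the paper.
    """
    G = np.stack(task_grads)                  # (num_tasks, dim)
    # Orthonormal basis of span{g_1, ..., g_T} via reduced QR factorization.
    Q, _ = np.linalg.qr(G.T)                  # Q: (dim, num_tasks)
    coords = G @ Q                            # per-task coordinates in the basis
    # A direction is treated as task-shared when no pair of projections
    # has opposite signs (a zero projection is not counted as a conflict).
    shared = (coords.min(axis=0) * coords.max(axis=0)) >= 0
    # Sum the task gradients, then keep only their task-shared components.
    g_sum = G.sum(axis=0)
    B = Q[:, shared]
    return B @ (B.T @ g_sum)

# Hypothetical usage with two 2-d task gradients that partially conflict:
g1 = np.array([1.0, 1.0])
g2 = np.array([1.0, -2.0])
update = orthogonal_decomposition_update([g1, g2])
print(update)  # the basis direction with sign disagreement has been dropped
```

The sketch uses a reduced QR factorization to obtain the orthonormal basis of the gradient span, which is the standard numerically stable alternative to explicit Gram-Schmidt orthogonalization.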
Pages: 386 - 395
Number of pages: 10
Related Papers
50 records in total
  • [21] Gradient Coordination for Quantifying and Maximizing Knowledge Transference in Multi-Task Learning
    Yang, Xuanhua
    Zhao, Jianxin
    Liu, Shaoguo
    Wang, Liang
    Zheng, Bo
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 2032 - 2036
  • [22] Spatio-Temporal Multi-Task Learning via Tensor Decomposition
    Xu, Jianpeng
    Zhou, Jiayu
    Tan, Pang-Ning
    Liu, Xi
    Luo, Lifeng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (06) : 2764 - 2775
  • [23] Learning to Branch for Multi-Task Learning
    Guo, Pengsheng
    Lee, Chen-Yu
    Ulbricht, Daniel
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [24] Learning to Branch for Multi-Task Learning
    Guo, Pengsheng
    Lee, Chen-Yu
    Ulbricht, Daniel
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
  • [25] Boosted multi-task learning
    Chapelle, Olivier
    Shivaswamy, Pannagadatta
    Vadrevu, Srinivas
    Weinberger, Kilian
    Zhang, Ya
    Tseng, Belle
    MACHINE LEARNING, 2011, 85 : 149 - 173
  • [26] An overview of multi-task learning
    Zhang, Yu
    Yang, Qiang
    NATIONAL SCIENCE REVIEW, 2018, 5 (01) : 30 - 43
  • [27] On Partial Multi-Task Learning
    He, Yi
    Wu, Baijun
    Wu, Di
    Wu, Xindong
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 1174 - 1181
  • [28] Federated Multi-Task Learning
    Smith, Virginia
    Chiang, Chao-Kai
    Sanjabi, Maziar
    Talwalkar, Ameet
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [29] Pareto Multi-Task Learning
    Lin, Xi
    Zhen, Hui-Ling
    Li, Zhenhua
    Zhang, Qingfu
    Kwong, Sam
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [30] Asynchronous Multi-Task Learning
    Baytas, Inci M.
    Yan, Ming
    Jain, Anil K.
    Zhou, Jiayu
    2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2016, : 11 - 20