GDOD: Effective Gradient Descent using Orthogonal Decomposition for Multi-Task Learning

Cited by: 1
Authors
Dong, Xin [1 ]
Wu, Ruize [2 ]
Xiong, Chao [1 ]
Li, Hai [1 ]
Cheng, Lei [2 ]
He, Yong [2 ]
Qian, Shiyou [3 ]
Cao, Jian [3 ]
Mo, Linjian [1 ]
Affiliations
[1] Ant Grp, Shanghai, Peoples R China
[2] Ant Grp, Hangzhou, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Keywords
multi-task learning; orthogonal decomposition; gradient conflict;
DOI
10.1145/3511808.3557333
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline Code
0812;
Abstract
Multi-task learning (MTL) aims at solving multiple related tasks simultaneously and has experienced rapid growth in recent years. However, MTL models often suffer from performance degeneration with negative transfer due to learning several tasks simultaneously. Some related work attributes the source of the problem to conflicting gradients, in which case useful gradient updates need to be selected carefully for all tasks. To this end, we propose a novel optimization approach for MTL, named GDOD, which manipulates the gradients of each task using an orthogonal basis decomposed from the span of all task gradients. GDOD explicitly decomposes gradients into task-shared and task-conflict components and adopts a general update rule that avoids interference across all task gradients, allowing the update directions to be guided by the task-shared components. Moreover, we prove the convergence of GDOD theoretically under both convex and non-convex assumptions. Experimental results on several multi-task datasets not only demonstrate the significant improvement that GDOD brings to existing MTL models but also show that our algorithm outperforms state-of-the-art optimization methods in terms of AUC and Logloss metrics.
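Based only on the description in the abstract, the core idea can be sketched roughly as follows. This is an illustrative assumption, not the exact GDOD update rule derived in the paper: the function name `orthogonal_decomposition_update`, the sign-agreement test for deciding which basis directions are "task-shared", and the choice of summing the task gradients before projection are all hypothetical.

```python
import numpy as np

def orthogonal_decomposition_update(task_grads):
    """Illustrative sketch: combine per-task gradients via an orthonormal
    basis of their span, keeping only the basis directions on which all
    tasks' projections agree in sign ("task-shared") and discarding the
    rest ("task-conflict").

    NOTE: this is an assumption-based reading of the abstract, not the
    exact GDOD update rule from the paper.
    """
    G = np.stack(task_grads)                  # (num_tasks, dim)
    # Orthonormal basis of span{g_1, ..., g_T} via reduced QR factorization.
    Q, _ = np.linalg.qr(G.T)                  # Q: (dim, num_tasks)
    coords = G @ Q                            # per-task coordinates in the basis
    # A direction is treated as task-shared when no pair of projections
    # has opposite signs (a zero projection is not counted as a conflict).
    shared = (coords.min(axis=0) * coords.max(axis=0)) >= 0
    # Sum the task gradients, then keep only their task-shared components.
    g_sum = G.sum(axis=0)
    B = Q[:, shared]
    return B @ (B.T @ g_sum)

# Hypothetical usage with two 2-d task gradients that partially conflict:
g1 = np.array([1.0, 1.0])
g2 = np.array([1.0, -2.0])
update = orthogonal_decomposition_update([g1, g2])
print(update)  # the basis direction with sign disagreement has been dropped
```

The sketch uses a reduced QR factorization to obtain the orthonormal basis of the gradient span, which is the standard numerically stable alternative to explicit Gram-Schmidt orthogonalization.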
Pages: 386 - 395
Number of pages: 10
Related Papers
50 records in total
  • [21] Gradient Coordination for Quantifying and Maximizing Knowledge Transference in Multi-Task Learning
    Yang, Xuanhua
    Zhao, Jianxin
    Liu, Shaoguo
    Wang, Liang
    Zheng, Bo
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 2032 - 2036
  • [22] Spatio-Temporal Multi-Task Learning via Tensor Decomposition
    Xu, Jianpeng
    Zhou, Jiayu
    Tan, Pang-Ning
    Liu, Xi
    Luo, Lifeng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (06) : 2764 - 2775
  • [23] Learning to Branch for Multi-Task Learning
    Guo, Pengsheng
    Lee, Chen-Yu
    Ulbricht, Daniel
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [24] Learning to Branch for Multi-Task Learning
    Guo, Pengsheng
    Lee, Chen-Yu
    Ulbricht, Daniel
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
  • [25] Boosted multi-task learning
    Chapelle, Olivier
    Shivaswamy, Pannagadatta
    Vadrevu, Srinivas
    Weinberger, Kilian
    Zhang, Ya
    Tseng, Belle
    MACHINE LEARNING, 2011, 85 : 149 - 173
  • [26] An overview of multi-task learning
    Zhang, Yu
    Yang, Qiang
    NATIONAL SCIENCE REVIEW, 2018, 5 (01) : 30 - 43
  • [27] On Partial Multi-Task Learning
    He, Yi
    Wu, Baijun
    Wu, Di
    Wu, Xindong
    ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 1174 - 1181
  • [28] Federated Multi-Task Learning
    Smith, Virginia
    Chiang, Chao-Kai
    Sanjabi, Maziar
    Talwalkar, Ameet
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [29] Pareto Multi-Task Learning
    Lin, Xi
    Zhen, Hui-Ling
    Li, Zhenhua
    Zhang, Qingfu
    Kwong, Sam
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [30] Asynchronous Multi-Task Learning
    Baytas, Inci M.
    Yan, Ming
    Jain, Anil K.
    Zhou, Jiayu
    2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2016, : 11 - 20