Multi-Task Learning with Knowledge Distillation for Dense Prediction

Cited by: 2
Authors
Xu, Yangyang [1 ,2 ]
Yang, Yibo [4 ]
Zhang, Lefei [1 ,2 ,3 ]
Affiliations
[1] Wuhan Univ, Inst Artificial Intelligence, Wuhan, Peoples R China
[2] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[3] Hubei Luojia Lab, Wuhan, Peoples R China
[4] King Abdullah Univ Sci & Technol, Jeddah, Saudi Arabia
Funding
National Natural Science Foundation of China;
DOI
10.1109/ICCV51070.2023.01970
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
While multi-task learning (MTL) has become an attractive research topic, its training usually poses more difficulties than the single-task case. How to successfully apply knowledge distillation to MTL to improve training efficiency and model performance remains a challenging problem. In this paper, we introduce a new knowledge distillation procedure with an alternative match for MTL of dense prediction, based on two simple design principles. First, for memory and training efficiency, we use a single strong multi-task model as the teacher during training instead of the multiple teachers widely adopted in existing studies. Second, we employ the less sensitive Cauchy-Schwarz (CS) divergence instead of the Kullback-Leibler (KL) divergence and propose a CS distillation loss accordingly. With this less sensitive divergence, our knowledge distillation with an alternative match captures inter-task and intra-task information between the teacher model and the student model of each task, thereby learning more "dark knowledge" for effective distillation. We conducted extensive experiments on dense prediction datasets, including NYUD-v2 and PASCAL-Context, for multiple vision tasks such as semantic segmentation, human parts segmentation, depth estimation, surface normal estimation, and boundary detection. The results show that our proposed method clearly improves both model performance and practical inference efficiency.
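
As a point of reference for the distillation objective described in the abstract, below is a minimal PyTorch sketch of the standard Cauchy-Schwarz divergence computed between per-pixel student and teacher class distributions. The function name, tensor shapes, and reduction are illustrative assumptions; the paper's actual CS distillation loss and its inter-/intra-task matching scheme may differ in detail.

```python
import torch
import torch.nn.functional as F

def cs_distillation_loss(student_logits, teacher_logits, eps=1e-8):
    """Standard Cauchy-Schwarz (CS) divergence between student and teacher
    class distributions for dense prediction logits of shape (N, C, H, W):

        D_CS(p, q) = -log( <p, q> / (||p||_2 * ||q||_2) )

    Computed per pixel and averaged. A generic sketch, not the paper's
    exact loss or matching scheme.
    """
    p = F.softmax(student_logits, dim=1)   # student distribution over classes
    q = F.softmax(teacher_logits, dim=1)   # teacher distribution over classes
    inner = (p * q).sum(dim=1)                                # <p, q> per pixel
    norms = p.pow(2).sum(dim=1).sqrt() * q.pow(2).sum(dim=1).sqrt()
    cs = -torch.log((inner / (norms + eps)).clamp_min(eps))   # per-pixel CS divergence
    return cs.mean()
```

Because the CS divergence involves no ratio of probabilities, it avoids the numerical blow-ups that KL distillation can exhibit when teacher probabilities approach zero, which is consistent with the "less sensitive" property highlighted in the abstract.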
Pages: 21493-21502
Number of pages: 10
Related Papers
50 records in total
  • [1] Online Knowledge Distillation for Multi-task Learning
    Jacob, Geethu Miriam
    Agarwal, Vishal
    Stenger, Bjorn
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2358 - 2367
  • [2] Multi-Task Knowledge Distillation for Eye Disease Prediction
    Chelaramani, Sahil
    Gupta, Manish
    Agarwal, Vipul
    Gupta, Prashant
    Habash, Ranya
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 3982 - 3992
  • [3] Multi-Task Learning for Dense Prediction Tasks: A Survey
    Vandenhende, Simon
    Georgoulis, Stamatios
    Van Gansbeke, Wouter
    Proesmans, Marc
    Dai, Dengxin
    Van Gool, Luc
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (07) : 3614 - 3633
  • [4] Application of Knowledge Distillation to Multi-Task Speech Representation Learning
    Kerpicci, Mine
    Van Nguyen
    Zhang, Shuhua
    Visser, Erik
    INTERSPEECH 2023, 2023, : 2813 - 2817
  • [5] Multi-Task Learning With Multi-Query Transformer for Dense Prediction
    Xu, Yangyang
    Li, Xiangtai
    Yuan, Haobo
    Yang, Yibo
    Zhang, Lefei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (02) : 1228 - 1240
  • [6] Contrastive Multi-Task Dense Prediction
    Yang, Siwei
    Ye, Hanrong
    Xu, Dan
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3190 - 3197
  • [7] Knowledge Distillation and Multi-task Feature Learning for Partial Discharge Recognition
    Ji, Jinsheng
    Shu, Zhou
    Li, Hongqun
    Lai, Kai Xian
    Zheng, Yuanjin
    Jiang, Xudong
    2023 IEEE 32ND CONFERENCE ON ELECTRICAL PERFORMANCE OF ELECTRONIC PACKAGING AND SYSTEMS, EPEPS, 2023,
  • [8] MULTI-TASK DISTILLATION: TOWARDS MITIGATING THE NEGATIVE TRANSFER IN MULTI-TASK LEARNING
    Meng, Ze
    Yao, Xin
    Sun, Lifeng
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 389 - 393
  • [9] DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction
    Xu, Yangyang
    Yang, Yibo
    Zhang, Lefei
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 3, 2023, : 3072 - 3080
  • [10] Cross-Task Knowledge Distillation in Multi-Task Recommendation
    Yang, Chenxiao
    Pan, Junwei
    Gao, Xiaofeng
    Jiang, Tingyu
    Liu, Dapeng
    Chen, Guihai
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 4318 - 4326