MDL-NAS: A Joint Multi-domain Learning Framework for Vision Transformer

Cited by: 8
Authors
Wang, Shiguang [1 ,3 ]
Xie, Tao [2 ,3 ]
Cheng, Jian [1 ]
Zhang, Xingcheng [3 ]
Liu, Haijun [4 ]
Affiliations
[1] Univ Elect Sci & Technol China, Chengdu, Peoples R China
[2] Harbin Inst Technol, Harbin, Peoples R China
[3] SenseTime Res, Hong Kong, Peoples R China
[4] Chongqing Univ, Chongqing, Peoples R China
Funding
National Natural Science Foundation of China;
DOI
10.1109/CVPR52729.2023.01924
CLC classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we introduce MDL-NAS, a unified framework that integrates multiple vision tasks into a manageable supernet and optimizes these tasks collectively under diverse dataset domains. MDL-NAS is storage-efficient, since multiple models that share a majority of their parameters can be consolidated into a single one. Technically, MDL-NAS constructs a coarse-to-fine search space, where the coarse search space offers various optimal architectures for different tasks while the fine search space provides fine-grained parameter sharing to tackle the inherent obstacles of multi-domain learning. In the fine search space, we propose two parameter sharing policies, i.e., a sequential sharing policy and a mask sharing policy. Compared with previous works, these two policies allow the parameters at each layer of the network to be partially shared or not shared at all, hence attaining truly fine-grained parameter sharing. Finally, we present a joint-subnet search algorithm that finds the optimal architecture and sharing parameters for each task within a total resource constraint, challenging the traditional practice that downstream vision tasks are typically equipped with backbone networks designed for image classification. Experimentally, we demonstrate that MDL-NAS families fitted with non-hierarchical or hierarchical transformers deliver competitive performance on all tasks compared with state-of-the-art methods while maintaining efficient storage deployment and computation. We also demonstrate that MDL-NAS supports incremental learning and evades catastrophic forgetting when generalizing to a new task.
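The mask sharing idea from the abstract can be pictured with a toy sketch: each task owns a binary mask over one layer's weights, selecting per-entry between a shared parameter pool and task-specific parameters, so parameters at a layer may be partially shared across tasks. All names, shapes, and the 0.7 sharing ratio below are illustrative assumptions for intuition only, not the paper's actual implementation.

```python
import random

random.seed(0)
D = 8  # toy layer width
TASKS = ("cls", "det")

# Shared parameter pool plus per-task private parameters (illustrative).
shared = [random.gauss(0, 1) for _ in range(D)]
private = {t: [random.gauss(0, 1) for _ in range(D)] for t in TASKS}

# Binary mask per task: 1 -> reuse the shared parameter, 0 -> use a private one.
masks = {t: [1 if random.random() < 0.7 else 0 for _ in range(D)] for t in TASKS}

def task_weights(task):
    """Assemble the effective weight vector for one task from its mask."""
    return [s if m else p
            for s, p, m in zip(shared, private[task], masks[task])]

w_cls = task_weights("cls")
w_det = task_weights("det")

# Positions where both masks are 1 hold identical values across tasks:
# those parameters are genuinely shared, the rest remain task-specific,
# which is what makes storing multiple tasks in one supernet cheap.
shared_idx = [i for i in range(D) if masks["cls"][i] and masks["det"][i]]
```

In this toy setup, only the private entries and the masks add per-task storage on top of the shared pool, which mirrors why partial per-layer sharing keeps the multi-task model compact.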
Pages: 20094-20104
Page count: 11
Related papers
50 items in total
  • [1] MDViT: Multi-domain Vision Transformer for Small Medical Image Segmentation Datasets
    Du, Siyi
    Bayasi, Nourhan
    Hamarneh, Ghassan
    Garbi, Rafeef
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT IV, 2023, 14223 : 448 - 458
  • [2] Combinatorial coverage framework for machine learning in multi-domain operations
    Cody, Tyler
    Kauffman, Justin
    Krometis, Justin
    Sobien, Dan
    Freeman, Laura
    ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR MULTI-DOMAIN OPERATIONS APPLICATIONS IV, 2022, 12113
  • [3] Joint multi-domain feature learning for image steganalysis based on CNN
    Wang, Ze
    Chen, Mingzhi
    Yang, Yu
    Lei, Min
    Dong, Zhexuan
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2020, 2020 (01)
  • [6] JointGAN: Multi-Domain Joint Distribution Learning with Generative Adversarial Nets
    Pu, Yunchen
    Dai, Shuyang
    Gan, Zhe
    Wang, Weiyao
    Wang, Guoyin
    Zhang, Yizhe
    Henao, Ricardo
    Carin, Lawrence
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [7] ADAPTABLE MULTI-DOMAIN LANGUAGE MODEL FOR TRANSFORMER ASR
    Lee, Taewoo
    Lee, Min-Joong
    Kang, Tae Gyoon
    Jung, Seokyeoung
    Kwon, Minseok
    Hong, Yeona
    Lee, Jungin
    Woo, Kyoung-Gu
    Kim, Ho-Gyeong
    Jeong, Jiseung
    Lee, Jihyun
    Lee, Hosik
    Choi, Young Sang
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7358 - 7362
  • [8] Factorized Transformer for Multi-Domain Neural Machine Translation
    Deng, Yongchao
    Yu, Hongfei
    Yu, Heng
    Duan, Xiangyu
    Luo, Weihua
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 4221 - 4230
  • [9] Unsupervised person re-identification via multi-domain joint learning
    Chen, Feng
    Wang, Nian
    Tang, Jun
    Yan, Pu
    Yu, Jun
    PATTERN RECOGNITION, 2023, 138
  • [10] Multi-Domain Active Learning for Recommendation
    Zhang, Zihan
    Jin, Xiaoming
    Li, Lianghao
    Ding, Guiguang
    Yang, Qiang
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2358 - 2364