Inherit or discard: learning better domain-specific child networks from the general domain for multi-domain NMT

Authors
Xu, Jinlei [1, 2]
Wen, Yonghua [1, 2]
Xiang, Yan [1, 2]
Jiang, Shuting [1, 2]
Huang, Yuxin [1, 2]
Yu, Zhengtao [1, 2]
Affiliations
[1] Kunming Univ Sci & Technol, Sch Fac Informat Engn & Automat, Kunming 650500, Yunnan, Peoples R China
[2] Kunming Univ Sci & Technol, Yunnan Key Lab Artificial Intelligence, Kunming 650500, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-domain NMT; Parameter interference; Parameter inheritance; Gradient similarity
DOI
10.1007/s13042-024-02253-w
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-domain NMT aims to develop a single parameter-sharing model that translates both general-domain and domain-specific text (e.g., biology or law), but such models often suffer from parameter interference. Existing approaches typically tackle this issue by learning a domain-specific sub-network for each domain, treating all domains equally, and thereby ignore the significant data imbalance across domains. For instance, the general domain often has ten times as much training data as the biological domain. In this paper, we observe a natural similarity between the general and specific domains, such as shared vocabulary and similar sentence structure. We propose a novel parameter inheritance strategy that adaptively learns domain-specific child networks from the general domain. Our approach uses gradient similarity as the criterion for deciding which general-domain parameters a specific domain should inherit and which it should discard. Extensive experiments on several multi-domain NMT corpora demonstrate that our method significantly outperforms several strong baselines. In addition, our method generalizes well when adapted to few-shot multi-domain NMT scenarios. Further investigation shows that our method is also interpretable: the parameters that a child network inherits from the general domain depend on how closely the specific domain is related to the general domain.
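The abstract does not spell out how gradient similarity is turned into an inherit-or-discard decision, so the following is only a minimal sketch under stated assumptions: gradients are compared per named parameter group with cosine similarity, and a group is inherited when the similarity reaches a threshold. The function names, the parameter-group names, the toy gradient values, and the threshold of 0.0 are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("gradient vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def inheritance_mask(
    general_grads: Dict[str, Vector],
    domain_grads: Dict[str, Vector],
    threshold: float = 0.0,  # assumed cut-off, not a value from the paper
) -> Dict[str, bool]:
    """Mark each parameter group True (inherit) or False (discard).

    Assumption: a group is inherited when the general-domain gradient and
    the specific-domain gradient point in a similar direction.
    """
    return {
        name: cosine_similarity(general_grads[name], domain_grads[name]) >= threshold
        for name in general_grads
    }


# Toy gradients for two hypothetical parameter groups.
general = {"encoder.layer0": [1.0, 2.0, 0.5], "decoder.layer0": [0.5, -1.0, 0.0]}
biology = {"encoder.layer0": [0.9, 1.8, 0.7], "decoder.layer0": [-0.4, 1.2, 0.1]}

mask = inheritance_mask(general, biology)
print(mask)  # {'encoder.layer0': True, 'decoder.layer0': False}
```

In this toy case the encoder gradients agree in direction, so that group would be inherited, while the decoder gradients conflict, so that group would be discarded and left to the domain-specific child network.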
Pages: 5439 - 5452
Number of pages: 14