Continual Learning for Multilingual Neural Machine Translation via Dual Importance-based Model Division

Authors
Liu, Junpeng [1]
Huang, Kaiyu [2]
Yu, Hao [1]
Li, Jiuyi [1]
Su, Jinsong [3]
Huang, Degen [1]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Tsinghua Univ, Inst AI Ind Res, Beijing, Peoples R China
[3] Xiamen Univ, Xiamen, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A persistent goal of multilingual neural machine translation (MNMT) is to continually adapt the model to support new language pairs or improve some current language pairs without accessing the previous training data. To achieve this, the existing methods primarily focus on preventing catastrophic forgetting by making compromises between the original and new language pairs, leading to sub-optimal performance on both translation tasks. To mitigate this problem, we propose a dual importance-based model division method to divide the model parameters into two parts and separately model the translation of the original and new tasks. Specifically, we first remove the parameters that are negligible to the original tasks but essential to the new tasks to obtain a pruned model, which is responsible for the original translation tasks. Then we expand the pruned model with external parameters and fine-tune the newly added parameters with new training data. The whole fine-tuned model will be used for the new translation tasks. Experimental results show that our method can efficiently adapt the original model to various new translation tasks while retaining the performance of the original tasks. Further analyses demonstrate that our method consistently outperforms several strong baselines under different incremental translation scenarios.
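The prune-then-expand procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how importance is measured or combined, so the rank-difference score below is an assumption, as is the choice to place the newly added parameters in the pruned slots. All function names (`division_mask`, `old_task_weights`, `finetune_step`, `new_task_weights`) are hypothetical.

```python
from typing import List


def _ranks(values: List[float]) -> List[int]:
    """Rank each value: 0 = least important, len - 1 = most important."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def division_mask(imp_old: List[float], imp_new: List[float],
                  ratio: float) -> List[bool]:
    """Mark parameters to prune: low importance for the original tasks,
    high importance for the new tasks.

    Assumption: the two criteria are combined as (new rank - old rank);
    the actual scoring rule in the paper may differ.
    """
    n = len(imp_old)
    old_rank, new_rank = _ranks(imp_old), _ranks(imp_new)
    score = [new_rank[i] - old_rank[i] for i in range(n)]
    k = int(n * ratio)  # number of parameters to prune
    chosen = sorted(range(n), key=lambda i: score[i], reverse=True)[:k]
    mask = [False] * n
    for i in chosen:
        mask[i] = True
    return mask


def old_task_weights(weights: List[float], pruned: List[bool]) -> List[float]:
    """Pruned model used for the original tasks: pruned slots are zeroed."""
    return [0.0 if p else w for w, p in zip(weights, pruned)]


def finetune_step(new_params: List[float], pruned: List[bool],
                  grads: List[float], lr: float) -> List[float]:
    """One SGD step that updates only the newly added parameters.

    Assumption: new parameters live in the pruned slots; the kept
    original weights are frozen and never receive an update.
    """
    return [v - lr * g if p else v
            for v, g, p in zip(new_params, grads, pruned)]


def new_task_weights(weights: List[float], new_params: List[float],
                     pruned: List[bool]) -> List[float]:
    """Whole model used for the new tasks: kept weights plus new parameters."""
    return [v if p else w for w, v, p in zip(weights, new_params, pruned)]


# Toy example with four parameters and made-up importance scores.
weights = [0.5, -1.2, 0.3, 0.8]
imp_old = [0.9, 0.1, 0.2, 0.8]
imp_new = [0.1, 0.9, 0.7, 0.2]
pruned = division_mask(imp_old, imp_new, ratio=0.5)
new_params = finetune_step([0.0] * 4, pruned, grads=[1.0, 1.0, -2.0, 1.0], lr=0.1)
old_model = old_task_weights(weights, pruned)
new_model = new_task_weights(weights, new_params, pruned)
```

Because the kept weights are never updated, `old_model` stays the same before and after fine-tuning. That matches the abstract's claim that performance on the original tasks is retained, while `new_model` gains capacity in the slots the original tasks did not need.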
Pages: 12011-12027
Number of pages: 17