Continual Learning for Multilingual Neural Machine Translation via Dual Importance-based Model Division

Citations: 0
Authors
Liu, Junpeng [1 ]
Huang, Kaiyu [2 ]
Yu, Hao [1 ]
Li, Jiuyi [1 ]
Su, Jinsong [3 ]
Huang, Degen [1 ]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Tsinghua Univ, Inst AI Ind Res, Beijing, Peoples R China
[3] Xiamen Univ, Xiamen, Peoples R China
Keywords
DOI
N/A
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A persistent goal of multilingual neural machine translation (MNMT) is to continually adapt the model to support new language pairs or improve some current language pairs without accessing the previous training data. To achieve this, existing methods primarily focus on preventing catastrophic forgetting by making compromises between the original and new language pairs, leading to sub-optimal performance on both translation tasks. To mitigate this problem, we propose a dual importance-based model division method that divides the model parameters into two parts and separately models the translation of the original and new tasks. Specifically, we first remove the parameters that are negligible to the original tasks but essential to the new tasks to obtain a pruned model, which is responsible for the original translation tasks. Then we expand the pruned model with external parameters and fine-tune the newly added parameters on the new training data. The whole fine-tuned model is used for the new translation tasks. Experimental results show that our method can efficiently adapt the original model to various new translation tasks while retaining the performance of the original tasks. Further analyses demonstrate that our method consistently outperforms several strong baselines under different incremental translation scenarios.
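The abstract's two-step procedure (prune parameters negligible to the original tasks but important to the new ones, then fine-tune only the freed and newly added parameters) can be sketched in plain Python. This is an illustrative toy, not the paper's implementation: the threshold `tau`, the importance scores, and the function names are all hypothetical, and real importance scores would come from, e.g., gradient-based saliency computed on each task's data.

```python
def dual_importance_masks(imp_orig, imp_new, tau=0.1):
    """Divide parameters by dual importance scores (toy version).

    keep  -- parameters the original language pairs rely on (frozen backbone)
    freed -- parameters negligible to the original tasks but important
             to the new tasks; these are pruned and reassigned.
    """
    keep = [o >= tau for o in imp_orig]
    freed = [(not k) and n >= tau for k, n in zip(keep, imp_new)]
    return keep, freed


def adapt(weights, grads_new, freed, lr=0.1):
    """One fine-tuning step that updates only the reassigned slots,
    so the original-task parameters stay untouched."""
    return [w - lr * g if f else w
            for w, g, f in zip(weights, grads_new, freed)]


# Hypothetical importance scores for four parameters.
imp_orig = [0.9, 0.01, 0.7, 0.02]   # importance to the original pairs
imp_new = [0.1, 0.6, 0.2, 0.8]      # importance to the new pairs
keep, freed = dual_importance_masks(imp_orig, imp_new)

weights = [0.5, -0.2, 0.8, 0.05]
updated = adapt(weights, [1.0] * 4, freed)
# Slots 0 and 2 (the original backbone) are unchanged; slots 1 and 3 moved.
```

The key property the sketch preserves is the paper's stated goal: parameters serving the original tasks are never updated during adaptation, so original-task performance cannot degrade.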
Pages: 12011-12027
Page count: 17
Related Papers
50 results
  • [21] On Learning Meaningful Code Changes via Neural Machine Translation
    Tufano, Michele
    Pantiuchina, Jevgenija
    Watson, Cody
    Bavota, Gabriele
    Poshyvanyk, Denys
    2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2019), 2019, : 25 - 36
  • [22] Unpaired Multimodal Neural Machine Translation via Reinforcement Learning
    Wang, Yijun
    Wei, Tianxin
    Liu, Qi
    Chen, Enhong
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT II, 2021, 12682 : 168 - 185
  • [23] FedIMP: Parameter Importance-based Model Poisoning attack against Federated learning system
    Li, Xuan
    Wang, Naiyu
    Yuan, Shuai
    Guan, Zhitao
    COMPUTERS & SECURITY, 2024, 144
  • [24] Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization
    Wang, Yijun
    Xia, Yingce
    Zhao, Li
    Bian, Jiang
    Qin, Tao
    Liu, Guiquan
    Liu, Tie-Yan
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5553 - 5560
  • [25] Chinese-English-Burmese neural machine translation based on multilingual joint training
    Man Z.
    Mao C.
    Yu Z.
    Li X.
    Gao S.
    Zhu J.
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2021, 61 (09): : 927 - 935
  • [26] Addressing domain shift in neural machine translation via reinforcement learning
    Kumar, Amit
    Pratap, Ajay
    Singh, Anil Kumar
    Saha, Sriparna
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 201
  • [27] Research on Machine Translation Model Based on Neural Network
    Han, Zhuoran
    Li, Shenghong
    COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, CSPS 2018, VOL III: SYSTEMS, 2020, 517 : 244 - 251
  • [28] Reinforcement Learning based Curriculum Optimization for Neural Machine Translation
    Kumar, Gaurav
    Foster, George
    Cherry, Colin
    Krikun, Maxim
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 2054 - 2061
  • [29] Norm-Based Curriculum Learning for Neural Machine Translation
    Liu, Xuebo
    Lai, Houtim
    Wong, Derek F.
    Chao, Lidia S.
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 427 - 436
  • [30] Learning Confidence for Transformer-based Neural Machine Translation
    Lu, Yu
    Zeng, Jiali
    Zhang, Jiajun
    Wu, Shuangzhi
    Li, Mu
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 2353 - 2364