Continual Learning for Multilingual Neural Machine Translation via Dual Importance-based Model Division

Cited by: 0
Authors
Liu, Junpeng [1 ]
Huang, Kaiyu [2 ]
Yu, Hao [1 ]
Li, Jiuyi [1 ]
Su, Jinsong [3 ]
Huang, Degen [1 ]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Tsinghua Univ, Inst AI Ind Res, Beijing, Peoples R China
[3] Xiamen Univ, Xiamen, Peoples R China
Keywords
DOI
None available
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
A persistent goal of multilingual neural machine translation (MNMT) is to continually adapt the model to support new language pairs, or to improve some current language pairs, without accessing the previous training data. To achieve this, existing methods primarily focus on preventing catastrophic forgetting by making compromises between the original and new language pairs, leading to sub-optimal performance on both translation tasks. To mitigate this problem, we propose a dual importance-based model division method that divides the model parameters into two parts and separately models the translation of the original and new tasks. Specifically, we first remove the parameters that are negligible to the original tasks but essential to the new tasks, obtaining a pruned model that is responsible for the original translation tasks. We then expand the pruned model with external parameters and fine-tune the newly added parameters on the new training data; the whole fine-tuned model is used for the new translation tasks. Experimental results show that our method can efficiently adapt the original model to various new translation tasks while retaining the performance of the original tasks. Further analyses demonstrate that our method consistently outperforms several strong baselines under different incremental translation scenarios.
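The division step described in the abstract can be sketched as a masking operation over a flat parameter vector: parameters that matter little to the original tasks but a lot to the new tasks are freed up, and the rest stay fixed for the original translation tasks. The sketch below is a minimal illustration under assumed details — the scoring rule (new-task importance minus original-task importance) and the function name `dual_importance_division` are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def dual_importance_division(orig_importance, new_importance, prune_ratio):
    """Split a flat parameter vector into an original-task part and a
    freed part, in the spirit of dual importance-based model division.

    Parameters with low importance to the original tasks but high
    importance to the new tasks are freed first.  The difference score
    used here is an illustrative assumption, not the paper's criterion.
    """
    # High score => cheap to remove from the original tasks, valuable
    # to hand over to the new tasks.
    score = new_importance - orig_importance
    n_prune = int(len(score) * prune_ratio)
    # Indices of the parameters freed for the new tasks.
    freed = np.argsort(-score)[:n_prune]
    # True => parameter stays with the original (pruned) model.
    mask = np.ones(len(score), dtype=bool)
    mask[freed] = False
    return mask, freed

# Toy example with four parameters: indices 1 and 3 are unimportant to
# the original tasks but important to the new ones, so they are freed.
orig_imp = np.array([0.9, 0.1, 0.8, 0.05])
new_imp = np.array([0.2, 0.9, 0.1, 0.7])
mask, freed = dual_importance_division(orig_imp, new_imp, prune_ratio=0.5)
print(mask.tolist())            # [True, False, True, False]
print(sorted(freed.tolist()))   # [1, 3]
```

After this split, the masked (original-task) parameters would be kept frozen while only the freed slots, plus any externally added parameters, are fine-tuned on the new training data.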
Pages: 12011-12027
Page count: 17