Continual Learning for Multilingual Neural Machine Translation via Dual Importance-based Model Division

Cited by: 0
Authors
Liu, Junpeng [1 ]
Huang, Kaiyu [2 ]
Yu, Hao [1 ]
Li, Jiuyi [1 ]
Su, Jinsong [3 ]
Huang, Degen [1 ]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Tsinghua Univ, Inst AI Ind Res, Beijing, Peoples R China
[3] Xiamen Univ, Xiamen, Peoples R China
Keywords
DOI
None available
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
A persistent goal of multilingual neural machine translation (MNMT) is to continually adapt the model to support new language pairs, or to improve some current language pairs, without accessing the previous training data. To achieve this, existing methods primarily focus on preventing catastrophic forgetting by making compromises between the original and new language pairs, leading to sub-optimal performance on both translation tasks. To mitigate this problem, we propose a dual importance-based model division method that divides the model parameters into two parts and separately models the translation of the original and new tasks. Specifically, we first remove the parameters that are negligible to the original tasks but essential to the new tasks, obtaining a pruned model that is responsible for the original translation tasks. We then expand the pruned model with external parameters and fine-tune the newly added parameters on the new training data; the whole fine-tuned model is used for the new translation tasks. Experimental results show that our method can efficiently adapt the original model to various new translation tasks while retaining the performance of the original tasks. Further analyses demonstrate that our method consistently outperforms several strong baselines under different incremental translation scenarios.
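The division step described in the abstract can be sketched as a masking operation over a flat parameter vector: parameters that matter little to the original tasks but a lot to the new tasks are freed up, and the rest stay fixed for the original translation tasks. The sketch below is a minimal illustration under assumed details — the scoring rule (new-task importance minus original-task importance) and the function name `dual_importance_division` are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def dual_importance_division(orig_importance, new_importance, prune_ratio):
    """Split a flat parameter vector into an original-task part and a
    freed part, in the spirit of dual importance-based model division.

    Parameters with low importance to the original tasks but high
    importance to the new tasks are freed first.  The difference score
    used here is an illustrative assumption, not the paper's criterion.
    """
    # High score => cheap to remove from the original tasks, valuable
    # to hand over to the new tasks.
    score = new_importance - orig_importance
    n_prune = int(len(score) * prune_ratio)
    # Indices of the parameters freed for the new tasks.
    freed = np.argsort(-score)[:n_prune]
    # True => parameter stays with the original (pruned) model.
    mask = np.ones(len(score), dtype=bool)
    mask[freed] = False
    return mask, freed

# Toy example with four parameters: indices 1 and 3 are unimportant to
# the original tasks but important to the new ones, so they are freed.
orig_imp = np.array([0.9, 0.1, 0.8, 0.05])
new_imp = np.array([0.2, 0.9, 0.1, 0.7])
mask, freed = dual_importance_division(orig_imp, new_imp, prune_ratio=0.5)
print(mask.tolist())            # [True, False, True, False]
print(sorted(freed.tolist()))   # [1, 3]
```

After this split, the masked (original-task) parameters would be kept frozen while only the freed slots, plus any externally added parameters, are fine-tuned on the new training data.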
Pages: 12011-12027
Page count: 17