Keeping Models Consistent between Pretraining and Translation for Low-Resource Neural Machine Translation

Cited by: 3
Authors
Zhang, Wenbo [1 ,2 ,3 ]
Li, Xiao [1 ,2 ,3 ]
Yang, Yating [1 ,2 ,3 ]
Dong, Rui [1 ,2 ,3 ]
Luo, Gongxu [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi 830011, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830011, Peoples R China
Source
FUTURE INTERNET | 2020, Vol. 12, Issue 12
Funding
National Natural Science Foundation of China;
Keywords
low-resource neural machine translation; monolingual data; pretraining; transformer;
DOI
10.3390/fi12120215
Chinese Library Classification
TP [Automation & Computer Technology];
Discipline Code
0812;
Abstract
Recently, model pretraining has been successfully applied to unsupervised and semi-supervised neural machine translation. A cross-lingual language model uses a pretrained masked language model to initialize the encoder and decoder of the translation model, which greatly improves translation quality. However, because of a mismatch in the number of layers, the pretrained model can only initialize part of the decoder's parameters. In this paper, we use a layer-wise coordination transformer and a consistent pretraining translation transformer instead of a vanilla transformer as the translation model. The former has only an encoder, while the latter has an encoder and a decoder that have exactly the same parameters. Both models therefore guarantee that every parameter of the translation model can be initialized by the pretrained model. Experiments on Chinese-English and English-German datasets show that, compared with the vanilla transformer baseline, our models achieve better performance with fewer parameters when the parallel corpus is small.
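A rough intuition for why an encoder-only translation model sidesteps the initialization mismatch: if source and target are processed by a single Transformer stack whose layers have the same shape as masked-language-model layers, every weight in the translation model has a counterpart in the pretrained model. The PyTorch sketch below only illustrates that idea under simplified assumptions; it is not the authors' implementation, and names such as JointStackTranslator and build_joint_mask are invented here. It runs one encoder stack over the concatenated [source; target] sequence, with an attention mask that lets target positions see the full source but only earlier target positions.

```python
# Minimal sketch (not the paper's code) of an encoder-only translation model
# whose layers match a masked-LM encoder and could therefore be fully
# initialized from a pretrained MLM checkpoint.
import torch
import torch.nn as nn

def build_joint_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    """Additive attention mask for the concatenated [source; target] sequence:
    source positions attend only to the source; target positions attend to
    the full source plus earlier target positions (causal)."""
    total = src_len + tgt_len
    mask = torch.full((total, total), float("-inf"))
    mask[:, :src_len] = 0.0                                  # everyone sees the source
    causal = torch.triu(torch.ones(tgt_len, tgt_len), 1).bool()
    tgt_block = torch.zeros(tgt_len, tgt_len).masked_fill(causal, float("-inf"))
    mask[src_len:, src_len:] = tgt_block                     # causal within the target
    return mask

class JointStackTranslator(nn.Module):
    """Single Transformer stack over [source; target]; every layer has the
    same shape as a BERT-style MLM layer (illustrative toy sizes)."""
    def __init__(self, vocab_size=32000, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.stack = nn.TransformerEncoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        x = torch.cat([self.embed(src_ids), self.embed(tgt_ids)], dim=1)
        mask = build_joint_mask(src_ids.size(1), tgt_ids.size(1)).to(x.device)
        h = self.stack(x, mask=mask)
        return self.out(h[:, src_ids.size(1):])              # logits for target positions

if __name__ == "__main__":
    model = JointStackTranslator()
    src = torch.randint(0, 32000, (2, 7))
    tgt = torch.randint(0, 32000, (2, 5))
    print(model(src, tgt).shape)                             # torch.Size([2, 5, 32000])
```

In such a setup, loading a pretrained MLM checkpoint into the embedding and the stack would cover all translation parameters except the output projection; tying that projection to the embedding matrix (a common practice, not necessarily the paper's choice) would cover it as well.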
Pages: 1 - 13
Number of pages: 13
Related Papers
50 records in total
  • [1] A Survey on Low-Resource Neural Machine Translation
    Wang, Rui
    Tan, Xu
    Luo, Renqian
    Qin, Tao
    Liu, Tie-Yan
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 4636 - 4643
  • [2] Transformers for Low-resource Neural Machine Translation
    Gezmu, Andargachew Mekonnen
    Nuernberger, Andreas
    ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1, 2022, : 459 - 466
  • [3] A Survey on Low-resource Neural Machine Translation
    Li H.-Z.
    Feng C.
    Huang H.-Y.
    Science Press, 47: 1217 - 1231
  • [4] Low-Resource Neural Machine Translation with Neural Episodic Control
    Wu, Nier
    Hou, Hongxu
    Sun, Shuo
    Zheng, Wei
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [5] Low-resource Neural Machine Translation: Methods and Trends
    Shi, Shumin
    Wu, Xing
    Su, Rihai
    Huang, Heyan
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (05)
  • [6] Neural Machine Translation for Low-resource Languages: A Survey
    Ranathunga, Surangika
    Lee, En-Shiun Annie
    Skenduli, Marjana Prifti
    Shekhar, Ravi
    Alam, Mehreen
    Kaur, Rishemjit
    ACM COMPUTING SURVEYS, 2023, 55 (11)
  • [7] Data Augmentation for Low-Resource Neural Machine Translation
    Fadaee, Marzieh
    Bisazza, Arianna
    Monz, Christof
    PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 2, 2017, : 567 - 573
  • [8] Recent advances of low-resource neural machine translation
    Haque, Rejwanul
    Liu, Chao-Hong
    Way, Andy
    MACHINE TRANSLATION, 2021, 35 (04) : 451 - 474
  • [9] Translation Memories as Baselines for Low-Resource Machine Translation
    Knowles, Rebecca
    Littell, Patrick
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6759 - 6767
  • [10] Survey of Low-Resource Machine Translation
    Haddow, Barry
    Bawden, Rachel
    Barone, Antonio Valerio Miceli
    Helcl, Jindrich
    Birch, Alexandra
    COMPUTATIONAL LINGUISTICS, 2022, 48 (03) : 673 - 732