Keeping Models Consistent between Pretraining and Translation for Low-Resource Neural Machine Translation

Cited by: 3
Authors
Zhang, Wenbo [1 ,2 ,3 ]
Li, Xiao [1 ,2 ,3 ]
Yang, Yating [1 ,2 ,3 ]
Dong, Rui [1 ,2 ,3 ]
Luo, Gongxu [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi 830011, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830011, Peoples R China
Source
FUTURE INTERNET | 2020 / Vol. 12 / No. 12
Funding
National Natural Science Foundation of China;
Keywords
low-resource neural machine translation; monolingual data; pretraining; transformer;
DOI
10.3390/fi12120215
Chinese Library Classification (CLC)
TP [Automation & Computer Technology];
Discipline code
0812;
Abstract
Recently, the pretraining of models has been successfully applied to unsupervised and semi-supervised neural machine translation. A cross-lingual language model uses a pretrained masked language model to initialize the encoder and decoder of the translation model, which greatly improves the translation quality. However, because of a mismatch in the number of layers, the pretrained model can only initialize part of the decoder's parameters. In this paper, we use a layer-wise coordination transformer and a consistent pretraining translation transformer instead of a vanilla transformer as the translation model. The former has only an encoder, and the latter has an encoder and a decoder, but the encoder and decoder have exactly the same parameters. Both models can guarantee that all parameters in the translation model can be initialized by the pretrained model. Experiments on the Chinese-English and English-German datasets show that compared with the vanilla transformer baseline, our models achieve better performance with fewer parameters when the parallel corpus is small.
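The abstract's core idea is that the encoder and decoder of the translation model share exactly the same parameters, so a pretrained encoder-only masked language model can initialize every translation parameter, with no layer-count mismatch leaving part of the decoder randomly initialized. A minimal plain-Python sketch of this parameter-sharing idea (hypothetical function names and toy parameter dictionaries, not the authors' implementation):

```python
# Hypothetical sketch: sharing layer parameters between encoder and decoder
# so that a pretrained encoder stack initializes the entire translation model.

def build_pretrained_encoder(num_layers):
    """Stand-in for a pretrained masked language model:
    one parameter dictionary per transformer layer."""
    return [{"attn": f"attn_{i}", "ffn": f"ffn_{i}"} for i in range(num_layers)]

def build_translation_model(pretrained_layers):
    """Encoder and decoder reuse the very same layer objects (not copies),
    so initializing from the pretrained stack covers all parameters and
    roughly halves the parameter count versus a vanilla transformer."""
    encoder = pretrained_layers
    decoder = pretrained_layers  # shared, not duplicated
    return encoder, decoder

pretrained = build_pretrained_encoder(num_layers=6)
enc, dec = build_translation_model(pretrained)

# Every decoder layer is the same object as the corresponding encoder layer,
# so no decoder parameter is left uninitialized by pretraining.
assert all(e is d for e, d in zip(enc, dec))
print(len(enc), len(dec))  # 6 6
```

In a real framework this would amount to pointing the decoder's modules at the encoder's weight tensors (weight tying) before fine-tuning on the parallel corpus.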
Pages: 1-13
Page count: 13
Related papers
50 records total
  • [41] Improved neural machine translation for low-resource English-Assamese pair
    Laskar, Sahinur Rahman
    Khilji, Abdullah Faiz Ur Rahman
    Pakray, Partha
    Bandyopadhyay, Sivaji
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (05) : 4727 - 4738
  • [42] A Bilingual Templates Data Augmentation Method for Low-Resource Neural Machine Translation
    Li, Fuxue
    Liu, Beibei
    Yan, Hong
    Shao, Mingzhi
    Xie, Peijun
    Li, Jiarui
    Chi, Chuncheng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT III, ICIC 2024, 2024, 14877 : 40 - 51
  • [43] An empirical study of low-resource neural machine translation of manipuri in multilingual settings
    Singh, Salam Michael
    Singh, Thoudam Doren
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (17): 14823 - 14844
  • [44] Pre-Training on Mixed Data for Low-Resource Neural Machine Translation
    Zhang, Wenbo
    Li, Xiao
    Yang, Yating
    Dong, Rui
    INFORMATION, 2021, 12 (03)
  • [45] Multi-granularity Knowledge Sharing in Low-resource Neural Machine Translation
    Mi, Chenggang
    Xie, Shaoliang
    Fan, Yi
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (02)
  • [46] Extremely Low-resource Multilingual Neural Machine Translation for Indic Mizo Language
    Lalrempuii C.
    Soni B.
    International Journal of Information Technology, 2023, 15 (8) : 4275 - 4282
  • [47] Low-Resource Neural Machine Translation with Multi-Strategy Prototype Generation
    Yu Z.-Q.
    Yu Z.-T.
    Huang Y.-X.
    Guo J.-J.
    Xian Y.-T.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (11): 5113 - 5125
  • [48] STA: An efficient data augmentation method for low-resource neural machine translation
    Li, Fuxue
    Chi, Chuncheng
    Yan, Hong
    Liu, Beibei
    Shao, Mingzhi
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (01) : 121 - 132
  • [50] The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation
    Saleva, Jonne
    Lignos, Constantine
    EACL 2021: THE 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2021, : 164 - 174