Keeping Models Consistent between Pretraining and Translation for Low-Resource Neural Machine Translation

Cited by: 3
Authors
Zhang, Wenbo [1 ,2 ,3 ]
Li, Xiao [1 ,2 ,3 ]
Yang, Yating [1 ,2 ,3 ]
Dong, Rui [1 ,2 ,3 ]
Luo, Gongxu [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi 830011, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830011, Peoples R China
Source
FUTURE INTERNET | 2020, Vol. 12, No. 12
Funding
National Natural Science Foundation of China;
Keywords
low-resource neural machine translation; monolingual data; pretraining; transformer;
DOI
10.3390/fi12120215
Chinese Library Classification (CLC) Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Recently, model pretraining has been successfully applied to unsupervised and semi-supervised neural machine translation. The cross-lingual language model approach uses a pretrained masked language model to initialize the encoder and decoder of the translation model, which greatly improves translation quality. However, because of a mismatch in the number of layers, the pretrained model can only initialize part of the decoder's parameters. In this paper, we use a layer-wise coordination transformer and a consistent pretraining translation transformer instead of the vanilla transformer as the translation model. The former has only an encoder; the latter has an encoder and a decoder, but the two share exactly the same parameters. Both designs guarantee that every parameter of the translation model can be initialized by the pretrained model. Experiments on Chinese-English and English-German datasets show that, compared with the vanilla transformer baseline, our models achieve better performance with fewer parameters when the parallel corpus is small.
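To make the consistency idea concrete, the following is a minimal PyTorch sketch, not the authors' released code, of an encoder-only translation model in the spirit of the layer-wise coordination transformer: source and target sentences are concatenated into one sequence and processed by a single self-attention stack, so a masked language model pretrained with the same stack layout can initialize every parameter. The names EncoderOnlyTranslator and init_from_mlm, and the concatenation-plus-masking scheme, are illustrative assumptions rather than the paper's exact architecture.

import torch
import torch.nn as nn

class EncoderOnlyTranslator(nn.Module):
    # Translation model with no separate decoder: one self-attention
    # stack runs over the concatenated source and target, so its
    # parameter set matches a masked-LM encoder of the same shape.
    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.stack = nn.TransformerEncoder(layer, num_layers)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Positional and language embeddings are omitted for brevity.
        x = self.embed(torch.cat([src_ids, tgt_ids], dim=1))
        s, t = src_ids.size(1), tgt_ids.size(1)
        n = s + t
        # Boolean mask, True = attention disallowed: source tokens see
        # only the source; target tokens see the full source plus,
        # causally, earlier target tokens.
        mask = torch.zeros(n, n, dtype=torch.bool, device=x.device)
        mask[:s, s:] = True
        mask[s:, s:] = torch.triu(
            torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.stack(x, mask=mask)
        return self.proj(h[:, s:])  # logits for the target positions only

def init_from_mlm(model, mlm_state_dict):
    # With no cross-attention anywhere, a pretrained masked-LM
    # checkpoint with the same layout covers every parameter;
    # strict=False only tolerates a possible vocabulary mismatch in
    # the embedding and output projection.
    return model.load_state_dict(mlm_state_dict, strict=False)

By contrast, in a vanilla encoder-decoder transformer the decoder's cross-attention sublayers have no counterpart in a masked language model, which is exactly the partial-initialization problem the abstract describes.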
Pages: 1-13 (13 pages)
Related Papers
(50 records in total)
  • [31] Neural machine translation for low-resource languages without parallel corpora
    Karakanta, Alina
    Dehdari, Jon
    van Genabith, Josef
    MACHINE TRANSLATION, 2018, 32 (1-2) : 167 - 189
  • [32] Regressing Word and Sentence Embeddings for Low-Resource Neural Machine Translation
    Unanue, I. J.
    Borzeshi, E. Z.
    Piccardi, M.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2023, 4 (03) : 450 - 463
  • [33] Efficient Low-Resource Neural Machine Translation with Reread and Feedback Mechanism
    Yu, Zhiqiang
    Yu, Zhengtao
    Guo, Junjun
    Huang, Yuxin
    Wen, Yonghua
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (03)
  • [34] Hierarchical Transfer Learning Architecture for Low-Resource Neural Machine Translation
    Luo, Gongxu
    Yang, Yating
    Yuan, Yang
    Chen, Zhanheng
    Ainiwaer, Aizimaiti
    IEEE ACCESS, 2019, 7 : 154157 - 154166
  • [35] Enhancing distant low-resource neural machine translation with semantic pivot
    Zhu, Enchang
    Huang, Yuxin
    Xian, Yantuan
    Zhu, Junguo
    Gao, Minghu
    Yu, Zhiqiang
    ALEXANDRIA ENGINEERING JOURNAL, 2025, 116 : 633 - 643
  • [36] Machine Translation into Low-resource Language Varieties
    Kumar, Sachin
    Anastasopoulos, Antonios
    Wintner, Shuly
    Tsvetkov, Yulia
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 2, 2021, : 110 - 121
  • [37] ANALYZING ASR PRETRAINING FOR LOW-RESOURCE SPEECH-TO-TEXT TRANSLATION
    Stoian, Mihaela C.
    Bansal, Sameer
    Goldwater, Sharon
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7909 - 7913
  • [38] Neural Machine Translation Advised by Statistical Machine Translation: The Case of Farsi-Spanish Bilingually Low-Resource Scenario
    Ahmadnia, Benyamin
    Kordjamshidi, Parisa
    Haffari, Gholamreza
    2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 1209 - 1213
  • [39] An empirical study of low-resource neural machine translation of manipuri in multilingual settings
    Singh, Salam Michael
    Singh, Thoudam Doren
    NEURAL COMPUTING AND APPLICATIONS, 2022, 34 : 14823 - 14844
  • [40] Pseudotext Injection and Advance Filtering of Low-Resource Corpus for Neural Machine Translation
    Adjeisah, Michael
    Liu, Guohua
    Nyabuga, Douglas Omwenga
    Nortey, Richard Nuetey
    Song, Jinling
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2021, 2021