Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

Authors
Chen, Guanhua [1]
Ma, Shuming [2]
Chen, Yun [3]
Dong, Li [2]
Zhang, Dongdong [2]
Pan, Jia [1]
Wang, Wenping [1,4]
Wei, Furu [2]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Texas A&M Univ, College Stn, TX 77843 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or on improving the performance of supervised machine translation with BERT. However, whether an MPE can facilitate the cross-lingual transferability of an NMT model remains under-explored. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with a parallel dataset of only one language pair and an off-the-shelf MPE, and is then directly tested on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT leverages the MPE with a two-stage training schedule and improves further with a position-disentangled encoder and a capacity-enhanced decoder. With this method, SixT significantly outperforms mBART, a pretrained multilingual encoder-decoder model explicitly designed for NMT, with an average improvement of 7.1 BLEU on zero-shot any-to-English test sets across 14 source languages. Furthermore, with much lower training computation cost and much less training data, our model achieves better performance on 15 any-to-English test sets than CRISS and m2m-100, two strong multilingual NMT baselines.
Pages: 15-26
Number of pages: 12