Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

Cited by: 0
Authors
Chen, Guanhua [1]
Ma, Shuming [2]
Chen, Yun [3]
Dong, Li [2]
Zhang, Dongdong [2]
Pan, Jia [1]
Wang, Wenping [1, 4]
Wei, Furu [2]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Texas A&M Univ, College Stn, TX 77843 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or on improving the performance of supervised machine translation with BERT. However, whether an MPE can facilitate the cross-lingual transferability of an NMT model remains under-explored. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with a parallel dataset of only one language pair and an off-the-shelf MPE, and is then directly tested on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT leverages the MPE with a two-stage training schedule and improves further with a position-disentangled encoder and a capacity-enhanced decoder. Using this method, SixT significantly outperforms mBART, a pretrained multilingual encoder-decoder model explicitly designed for NMT, with an average improvement of 7.1 BLEU on zero-shot any-to-English test sets across 14 source languages. Furthermore, with much lower training computation cost and less training data, our model achieves better performance on 15 any-to-English test sets than CRISS and m2m-100, two strong multilingual NMT baselines.
Pages: 15-26
Page count: 12
Related papers
50 in total
  • [21] Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking
    Schumacher, Elliot
    Mayfield, James
    Dredze, Mark
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 583-595
  • [22] Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction
    Huang, Kuan-Hao
    Hsu, I-Hung
    Natarajan, Premkumar
    Chang, Kai-Wei
    Peng, Nanyun
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022: 4633-4646
  • [23] Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models
    Huang, Po-Yao
    Patrick, Mandela
    Hu, Junjie
    Neubig, Graham
    Metze, Florian
    Hauptmann, Alexander
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021: 2443-2459
  • [24] Zero-Shot Cross-Lingual Opinion Target Extraction
    Jebbara, Soufian
    Cimiano, Philipp
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019: 2486-2495
  • [25] Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation
    Siddhant, Aditya
    Johnson, Melvin
    Tsai, Henry
    Ari, Naveen
    Riesa, Jason
    Bapna, Ankur
    Firat, Orhan
    Raman, Karthik
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34: 8854-8861
  • [26] XeroAlign: Zero-Shot Cross-lingual Transformer Alignment
    Gritta, Milan
    Iacobacci, Ignacio
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 371-381
  • [27] Zero-Shot Cross-Lingual Transfer in Legal Domain Using Transformer Models
    Shaheen, Zein
    Wohlgenannt, Gerhard
    Mouromtsev, Dmitry
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021: 450-456
  • [28] Zero-shot cross-lingual transfer language selection using linguistic similarity
    Eronen, Juuso
    Ptaszynski, Michal
    Masui, Fumito
    INFORMATION PROCESSING & MANAGEMENT, 2023, 60 (03)
  • [29] Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training
    Huang, Kuan-Hao
    Ahmad, Wasi Uddin
    Peng, Nanyun
    Chang, Kai-Wei
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021: 1684-1697
  • [30] Transfer language selection for zero-shot cross-lingual abusive language detection
    Eronen, Juuso
    Ptaszynski, Michal
    Masui, Fumito
    Arata, Masaki
    Leliwa, Gniewosz
    Wroczynski, Michal
    INFORMATION PROCESSING & MANAGEMENT, 2022, 59 (04)