Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders

Cited by: 0
Authors
Chen, Guanhua [1]
Ma, Shuming [2]
Chen, Yun [3]
Dong, Li [2]
Zhang, Dongdong [2]
Pan, Jia [1]
Wang, Wenping [1, 4]
Wei, Furu [2]
Affiliations
[1] Univ Hong Kong, Hong Kong, Peoples R China
[2] Microsoft Res, Redmond, WA USA
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Texas A&M Univ, College Stn, TX 77843 USA
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Previous work mainly focuses on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or on improving the performance of supervised machine translation with BERT. However, whether the MPE can facilitate the cross-lingual transferability of an NMT model remains under-explored. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with a parallel dataset of only one language pair and an off-the-shelf MPE, and is then tested directly on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT leverages the MPE with a two-stage training schedule and improves further with a position-disentangled encoder and a capacity-enhanced decoder. Using this method, SixT significantly outperforms mBART, a pretrained multilingual encoder-decoder model explicitly designed for NMT, with an average improvement of 7.1 BLEU on zero-shot any-to-English test sets across 14 source languages. Furthermore, with much less training computation and training data, our model achieves better performance on 15 any-to-English test sets than CRISS and m2m-100, two strong multilingual NMT baselines.
Pages: 15-26
Page count: 12