Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation

Cited by: 0
Authors
Ji, Baijun [2 ]
Zhang, Zhirui [3 ]
Duan, Xiangyu [1 ,2 ]
Zhang, Min [1 ,2 ]
Chen, Boxing [3 ]
Luo, Weihua [3 ]
Affiliations
[1] Soochow Univ, Inst Artificial Intelligence, Suzhou, Peoples R China
[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
[3] Alibaba DAMO Acad, Hangzhou, Peoples R China
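Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract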
Transfer learning between different language pairs has proven effective for Neural Machine Translation (NMT) in low-resource scenarios. However, existing transfer methods involving a common target language are far from successful in the extreme scenario of zero-shot translation, due to the language space mismatch between the transferor (the parent model) and the transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingual pre-training. Our key idea is to make all source languages share the same feature space and thus enable a smooth transition to zero-shot translation. To this end, we introduce one monolingual pre-training method and two bilingual pre-training methods to obtain a universal encoder for different languages. Once the universal encoder is constructed, the parent model built on this encoder is trained with large-scale annotated data and then directly applied in the zero-shot translation scenario. Experiments on two public datasets show that our approach significantly outperforms a strong pivot-based baseline and various multilingual NMT approaches.
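The transfer idea in the abstract can be illustrated with a deliberately tiny Python sketch. It is not the authors' method or code: a hand-written lookup table stands in for the pre-trained universal encoder, a word-level count table stands in for the parent NMT model, and all names (`UNIVERSAL_ENCODER`, `train_parent`, `translate`) and the example sentences are hypothetical. It only shows why a shared source-side feature space lets a parent trained on pivot-target data be applied to an unseen source language.

```python
from collections import Counter, defaultdict

# ASSUMPTION: a toy "universal encoder". Words from two source-side languages
# (English as pivot, French as the zero-shot source) map to the same feature id.
UNIVERSAL_ENCODER = {
    "cat": "F_CAT", "chat": "F_CAT",
    "black": "F_BLACK", "noir": "F_BLACK",
    "drinks": "F_DRINK", "boit": "F_DRINK",
}


def encode(tokens):
    """Map source tokens of any covered language into the shared feature space."""
    return [UNIVERSAL_ENCODER[t] for t in tokens]


def train_parent(pairs):
    """Fit a feature -> target-word table on pivot-target sentence pairs.

    Toy stand-in for parent-model training; assumes monotone word alignment.
    """
    counts = defaultdict(Counter)
    for src_tokens, tgt_tokens in pairs:
        for feature, word in zip(encode(src_tokens), tgt_tokens):
            counts[feature][word] += 1
    return {feature: c.most_common(1)[0][0] for feature, c in counts.items()}


def translate(model, tokens):
    """Apply the parent model to any language the encoder covers."""
    return [model[feature] for feature in encode(tokens)]


# Parent training data: English (pivot) -> German (target) only.
parent_data = [
    (["black", "cat"], ["schwarze", "Katze"]),
    (["cat", "drinks"], ["Katze", "trinkt"]),
]
parent = train_parent(parent_data)

# Zero-shot: French input, although no French-German pair was seen in training.
zero_shot = translate(parent, ["chat", "boit"])
print(zero_shot)
```

Because the French and English words share feature ids, the table learned from English-German pairs decodes the French input without any French-German training data. A real system would replace the lookup table with a pre-trained Transformer encoder and the count table with a trained decoder.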
Pages: 115-122
Page count: 8