Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation

被引:0
|
作者
Ji, Baijun [2 ]
Zhang, Zhirui [3 ]
Duan, Xiangyu [1 ,2 ]
Zhang, Min [1 ,2 ]
Chen, Boxing [3 ]
Luo, Weihua [3 ]
机构
[1] Soochow Univ, Inst Artificial Intelligence, Suzhou, Peoples R China
[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou, Peoples R China
[3] Alibaba DAMO Acad, Hangzhou, Peoples R China
基金
国家重点研发计划; 中国国家自然科学基金;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenario. However, existing transfer methods involving a common target language are far from success in the extreme scenario of zero-shot translation, due to the language space mismatch problem between transferor (the parent model) and transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingual pre-training. Our key idea is to make all source languages share the same feature space and thus enable a smooth transition for zero-shot translation. To this end, we introduce one monolingual pre-training method and two bilingual pre-training methods to obtain a universal encoder for different languages. Once the universal encoder is constructed, the parent model built on such encoder is trained with large-scale annotated data and then directly applied in zero-shot translation scenario. Experiments on two public datasets show that our approach significantly outperforms strong pivot-based baseline and various multilingual NMT approaches.
引用
收藏
页码:115 / 122
页数:8
相关论文
共 50 条
  • [31] Feature Aggregation in Zero-Shot Cross-Lingual Transfer Using Multilingual BERT
    Chen, Beiduo
    Guo, Wu
    Liu, Quan
    Tao, Kun
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1428 - 1435
  • [32] Zero-Shot Cross-Lingual Knowledge Transfer in VQA via Multimodal Distillation
    Weng, Yu
    Dong, Jun
    He, Wenbin
    Chaomurilige
    Liu, Xuan
    Liu, Zheng
    Gao, Honghao
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, : 1 - 11
  • [33] Mixed-Lingual Pre-training for Cross-lingual Summarization
    Xu, Ruochen
    Zhu, Chenguang
    Shi, Yu
    Zeng, Michael
    Huang, Xuedong
    1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 536 - 541
  • [34] Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-training
    Hardalov, Momchil
    Arora, Arnav
    Nakov, Preslav
    Augenstein, Isabelle
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 10729 - 10737
  • [35] XLPT-AMR: Cross-Lingual Pre-Training via Multi-Task Learning for Zero-Shot AMR Parsing and Text Generation
    Xu, Dongqin
    Li, Junhui
    Zhu, Muhua
    Zhang, Min
    Zhou, Guodong
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 896 - 907
  • [36] Translation-Based Implicit Annotation Projection for Zero-Shot Cross-Lingual Event Argument Extraction
    Lou, Chenwei
    Gao, Jun
    Yu, Changlong
    Wang, Wei
    Zhao, Huan
    Tu, Weiwei
    Xu, Ruifeng
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 2076 - 2081
  • [37] Towards zero-shot cross-lingual named entity disambiguation
    Barrena, Ander
    Soroa, Aitor
    Agirre, Eneko
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 184
  • [38] (ALMOST) ZERO-SHOT CROSS-LINGUAL SPOKEN LANGUAGE UNDERSTANDING
    Upadhyay, Shyam
    Faruqui, Manaal
    Tur, Gokhan
    Hakkani-Tur, Dilek
    Heck, Larry
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6034 - 6038
  • [39] Cross-lingual Contextualized Topic Models with Zero-shot Learning
    Bianchi, Federico
    Terragni, Silvia
    Hovy, Dirk
    Nozza, Debora
    Fersini, Elisabetta
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1676 - 1683
  • [40] Cross-Lingual BERT Transformation for Zero-Shot Dependency Parsing
    Wang, Yuxuan
    Che, Wanxiang
    Guo, Jiang
    Liu, Yijia
    Liu, Ting
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 5721 - 5727