Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

Cited by: 0
Authors
Lin, Zehui [1 ,2 ]
Pan, Xiao [1 ]
Wang, Mingxuan [1 ]
Qiu, Xipeng [2 ]
Feng, Jiangtao [1 ]
Zhou, Hao [1 ]
Li, Lei [1 ]
Affiliations
[1] ByteDance Lab, Beijing, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as the common seed and obtain derivative and improved models on arbitrary language pairs? We propose mRASP, an approach to pre-train a universal multilingual neural machine translation model. Our key idea in mRASP is its novel technique of random aligned substitution, which brings words and phrases with similar meanings across multiple languages closer in the representation space. We pre-train an mRASP model on 32 language pairs jointly, using only public datasets. The model is then fine-tuned on downstream language pairs to obtain specialized MT models. We carry out extensive experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs. Experimental results demonstrate that mRASP achieves significant performance improvements compared to directly training on those target pairs. This is the first work to verify that multiple low-resource language pairs can be utilized to improve rich-resource MT. Surprisingly, mRASP is even able to improve translation quality on exotic languages that never occur in the pre-training corpus. Code, data, and pre-trained models are available at https://github.com/linzehui/mRASP.
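For intuition, here is a minimal sketch of what random aligned substitution could look like on the source side. It is an illustration only, not the released mRASP implementation: the toy English-to-French lexicon, the whitespace tokenization, and the substitution probability are all assumptions made for the example, whereas the paper derives its aligned word pairs from public bilingual resources across the 32 pre-training language pairs.

```python
import random

# Sketch of "random aligned substitution" as described in the abstract:
# some source-side tokens are swapped for dictionary-aligned translations
# in another language, so words with similar meanings across languages
# end up sharing contexts during pre-training.

# Hypothetical English -> French lexicon (illustrative only).
EN_FR = {"love": "amour", "sing": "chanter", "dance": "danser"}

def random_aligned_substitution(tokens, lexicon, prob=0.3, seed=None):
    """Replace each token found in `lexicon` with its aligned translation
    with probability `prob`; all other tokens are kept unchanged."""
    rng = random.Random(seed)
    return [
        lexicon[tok] if tok in lexicon and rng.random() < prob else tok
        for tok in tokens
    ]

if __name__ == "__main__":
    source = "we love to sing and dance".split()
    mixed = random_aligned_substitution(source, EN_FR, prob=0.5, seed=0)
    print(mixed)  # with this seed: ['we', 'love', 'to', 'sing', 'and', 'danser']
```

In a setup like this, the code-switched source sentence would be paired with its unchanged target sentence and fed to the usual translation pre-training objective, so that substituted words and their original counterparts are pushed toward nearby points in the representation space.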
Pages: 2649-2663
Page count: 15
Related Papers (50 in total)
  • [31] Ji, Baijun; Zhang, Zhirui; Duan, Xiangyu; Zhang, Min; Chen, Boxing; Luo, Weihua. Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation. Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), 34: 115-122.
  • [32] Liu, Zihan; Winata, Genta Indra; Fung, Pascale. Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural Machine Translation. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021: 2706-2718.
  • [33] Ren, Shuo; Wu, Yu; Liu, Shujie; Zhou, Ming; Ma, Shuai. Explicit Cross-lingual Pre-training for Unsupervised Machine Translation. Proceedings of EMNLP-IJCNLP 2019: 770-779.
  • [34] Chen, Linqing; Li, Junhui; Gong, Zhengxian; Chen, Boxing; Luo, Weihua; Zhang, Min; Zhou, Guodong. Breaking Corpus Bottleneck for Context-Aware Neural Machine Translation with Cross-Task Pre-training. Proceedings of ACL-IJCNLP 2021, Vol. 1: 2851-2861.
  • [35] Wang, Q.; Li, M.; Wu, S.; Wang, M. Neural Machine Translation Based on XLM-R Cross-lingual Pre-training Language Model. Beijing Daxue Xuebao (Ziran Kexue Ban) / Acta Scientiarum Naturalium Universitatis Pekinensis, 2022, 58(01): 29-36.
  • [36] Zeng, Jiali; Jiang, Yufan; Yin, Yongjing; Jing, Yi; Meng, Fandong; Lin, Binghuai; Cao, Yunbo; Zhou, Jie. Soft Language Clustering for Multilingual Model Pre-training. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Vol. 1: 7021-7035.
  • [37] Dabre, Raj; Chu, Chenhui; Kunchukuttan, Anoop. A Survey of Multilingual Neural Machine Translation. ACM Computing Surveys, 2020, 53(05).
  • [38] Aharoni, Roee; Johnson, Melvin; Firat, Orhan. Massively Multilingual Neural Machine Translation. Proceedings of NAACL-HLT 2019, Vol. 1: 3874-3884.
  • [39] Arthur, Philip; Ryu, Dongwon K.; Haffari, Gholamreza. Multilingual Simultaneous Neural Machine Translation. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021: 4758-4766.
  • [40] Lai, Huiyuan; Toral, Antonio; Nissim, Malvina. Multilingual Pre-training with Language and Task Adaptation for Multilingual Text Style Transfer. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 2 (Short Papers): 262-271.