Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

Cited by: 0
Authors
Lin, Zehui [1 ,2 ]
Pan, Xiao [1 ]
Wang, Mingxuan [1 ]
Qiu, Xipeng [2 ]
Feng, Jiangtao [1 ]
Zhou, Hao [1 ]
Li, Lei [1 ]
Affiliations
[1] ByteDance Lab, Beijing, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as the common seed and obtain derivative, improved models on arbitrary language pairs? We propose mRASP, an approach to pre-train a universal multilingual neural machine translation model. The key idea of mRASP is its novel technique of random aligned substitution, which brings words and phrases with similar meanings across multiple languages closer in the representation space. We pre-train an mRASP model jointly on 32 language pairs using only public datasets. The model is then fine-tuned on downstream language pairs to obtain specialized MT models. We carry out extensive experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs. Experimental results demonstrate that mRASP achieves significant performance improvements compared to training directly on those target pairs. To the best of our knowledge, this is the first work to verify that multiple low-resource language pairs can be utilized to improve rich-resource MT. Surprisingly, mRASP is even able to improve translation quality on exotic languages that never occur in the pre-training corpus. Code, data, and pre-trained models are available at https://github.com/linzehui/mRASP.
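The random aligned substitution idea described in the abstract can be illustrated with a minimal sketch: during pre-training, tokens in a source sentence are randomly replaced with translations drawn from a bilingual dictionary (the paper builds on publicly available dictionaries such as MUSE), so that aligned words from different languages appear in shared contexts. The function name, dictionary format, and substitution probability below are illustrative assumptions, not the authors' implementation.

import random

def random_aligned_substitution(tokens, dictionary, ras_prob=0.3, seed=None):
    """Toy illustration of random aligned substitution (RAS).

    tokens:     list of source-language tokens
    dictionary: maps a source token to a list of candidate translations
                in other languages (assumed pre-extracted, e.g. from MUSE)
    ras_prob:   assumed probability of substituting an eligible token
    """
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        candidates = dictionary.get(tok)
        if candidates and rng.random() < ras_prob:
            # Swap the token for an aligned word in another language,
            # pushing cross-lingual synonyms into shared contexts.
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out

# Toy usage with a hypothetical English-to-French dictionary fragment:
en_fr = {"how": ["comment"], "you": ["tu"]}
print(random_aligned_substitution(["how", "are", "you"], en_fr, ras_prob=0.5, seed=1))
# Possible output: ['comment', 'are', 'tu']

In the actual pre-training setup, such code-switched source sentences are paired with the unchanged target sentences, which is what encourages the encoder to map semantically equivalent words across languages to nearby representations.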
Pages: 2649-2663
Page count: 15