Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

Cited by: 0
Authors
Lin, Zehui [1 ,2 ]
Pan, Xiao [1 ]
Wang, Mingxuan [1 ]
Qiu, Xipeng [2 ]
Feng, Jiangtao [1 ]
Zhou, Hao [1 ]
Li, Lei [1 ]
Affiliations
[1] ByteDance Lab, Beijing, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
Keywords: none listed
DOI: not available
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as a common seed from which derivative, improved models can be obtained for arbitrary language pairs? We propose mRASP, an approach to pre-training a universal multilingual neural machine translation model. The key idea of mRASP is its novel technique of random aligned substitution, which brings words and phrases with similar meanings across multiple languages closer in the representation space. We pre-train an mRASP model on 32 language pairs jointly, using only public datasets. The model is then fine-tuned on downstream language pairs to obtain specialized MT models. We carry out extensive experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs. Experimental results demonstrate that mRASP achieves significant performance improvements over models trained directly on those target pairs. This is the first work to verify that multiple low-resource language pairs can be utilized to improve rich-resource MT. Surprisingly, mRASP even improves translation quality on exotic languages that never occur in the pre-training corpus. Code, data, and pre-trained models are available at https://github.com/linzehui/mRASP.
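The random aligned substitution technique described in the abstract can be made concrete with a short sketch: during pre-training, source-side tokens are replaced, with some probability, by dictionary translations in other languages, so that words with similar meanings come to share contexts. The sketch below is a minimal illustration only; the function name, substitution rate, and toy English-French dictionary are assumptions made here for exposition, not details of the actual mRASP implementation.

import random

# Minimal sketch of random aligned substitution (RAS), assuming a simple
# word-level bilingual dictionary. Names and the substitution rate are
# hypothetical, not taken from the mRASP codebase.
def random_aligned_substitution(tokens, bilingual_dict, sub_prob=0.3, rng=random):
    """With probability sub_prob, replace a token by a random dictionary
    translation; otherwise keep the original token."""
    substituted = []
    for tok in tokens:
        candidates = bilingual_dict.get(tok)
        if candidates and rng.random() < sub_prob:
            substituted.append(rng.choice(candidates))  # swap in an aligned word
        else:
            substituted.append(tok)
    return substituted

# Toy usage with a tiny English-French dictionary.
en_fr = {"hello": ["bonjour"], "world": ["monde"], "morning": ["matin"]}
print(random_aligned_substitution("hello world this morning".split(), en_fr))
# Possible output: ['bonjour', 'world', 'this', 'matin']

The substituted source sentence remains paired with its original target during pre-training, which is what encourages aligned words from different languages to map to nearby points in the representation space.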
Pages: 2649-2663
Page count: 15
Related Papers (50 in total)
  • [1] Multilingual Denoising Pre-training for Neural Machine Translation. Liu, Yinhan; Gu, Jiatao; Goyal, Naman; Li, Xian; Edunov, Sergey; Ghazvininejad, Marjan; Lewis, Mike; Zettlemoyer, Luke. Transactions of the Association for Computational Linguistics, 2020, 8: 726-742.
  • [2] Pre-training Neural Machine Translation with Alignment Information via Optimal Transport. Su, Xueping; Zhao, Xingkai; Ren, Jie; Li, Yunhong; Raetsch, Matthias. Multimedia Tools and Applications, 2023, 83 (16): 48377-48397.
  • [3] Pre-training via Leveraging Assisting Languages for Neural Machine Translation. Song, Haiyue; Dabre, Raj; Mao, Zhuoyuan; Cheng, Fei; Kurohashi, Sadao; Sumita, Eiichiro. 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020): Student Research Workshop, 2020: 279-285.
  • [4] Multilingual Pre-training Model-Assisted Contrastive Learning Neural Machine Translation. Sun, Shuo; Hou, Hong-xu; Yang, Zong-heng; Wang, Yi-song. 2023 International Joint Conference on Neural Networks (IJCNN), 2023.
  • [5] Pre-training Methods for Neural Machine Translation. Wang, Mingxuan; Li, Lei. ACL-IJCNLP 2021: The 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Tutorial Abstracts, 2021: 21-25.
  • [6] Curriculum Pre-training for Stylized Neural Machine Translation. Zou, Aixiao; Wu, Xuanxuan; Li, Xinjie; Zhang, Ting; Cui, Fuwei; Xu, Jinan. Applied Intelligence, 2024, 54 (17-18): 7958-7968.
  • [7] On the Copying Behaviors of Pre-Training for Neural Machine Translation. Liu, Xuebo; Wang, Longyue; Wong, Derek F.; Ding, Liang; Chao, Lidia S.; Shi, Shuming; Tu, Zhaopeng. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021: 4265-4275.
  • [8] Multilingual Translation from Denoising Pre-Training. Tang, Yuqing; Tran, Chau; Li, Xian; Chen, Peng-Jen; Goyal, Naman; Chaudhary, Vishrav; Gu, Jiatao; Fan, Angela. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021: 3450-3466.
  • [9] DEEP: DEnoising Entity Pre-training for Neural Machine Translation. Hu, Junjie; Hayashi, Hiroaki; Cho, Kyunghyun; Neubig, Graham. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022: 1753-1766.