Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

Cited by: 0
Authors
Lin, Zehui [1 ,2 ]
Pan, Xiao [1 ]
Wang, Mingxuan [1 ]
Qiu, Xipeng [2 ]
Feng, Jiangtao [1 ]
Zhou, Hao [1 ]
Li, Lei [1 ]
Institutions
[1] ByteDance Lab, Beijing, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
Keywords
DOI
N/A
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as the common seed and obtain derivative, improved models on arbitrary language pairs? We propose mRASP, an approach to pre-train a universal multilingual neural machine translation model. The key idea of mRASP is its novel technique of random aligned substitution, which brings words and phrases with similar meanings across multiple languages closer in the representation space. We pre-train an mRASP model jointly on 32 language pairs using only public datasets. The model is then fine-tuned on downstream language pairs to obtain specialized MT models. We carry out extensive experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs. Experimental results demonstrate that mRASP achieves significant performance improvements over models trained directly on the target pairs. This is the first work to verify that multiple low-resource language pairs can be utilized to improve rich-resource MT. Surprisingly, mRASP even improves translation quality on exotic languages that never occur in the pre-training corpus. Code, data, and pre-trained models are available at https://github.com/linzehui/mRASP.
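The random aligned substitution idea in the abstract can be illustrated with a minimal sketch: tokens in a source sentence are randomly code-switched into dictionary translations from other languages, so that synonyms across languages appear in shared contexts during pre-training. This is an assumption-laden toy version, not the authors' implementation; the function name, the `sub_prob` parameter, and the tiny lexicon below are illustrative only (the paper uses large bilingual dictionaries across its 32 pre-training pairs).

```python
import random

def random_aligned_substitution(tokens, bilingual_dict, sub_prob=0.3, seed=None):
    """Toy sketch of random aligned substitution (RAS).

    Each token present in `bilingual_dict` is replaced, with probability
    `sub_prob`, by one of its translations in another language, producing a
    code-switched sentence for pre-training.
    """
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        translations = bilingual_dict.get(tok)
        if translations and rng.random() < sub_prob:
            out.append(rng.choice(translations))  # swap in a cross-lingual synonym
        else:
            out.append(tok)  # keep the original token
    return out

# Hypothetical English->French lexicon for illustration.
lexicon = {"hello": ["bonjour"], "world": ["monde"]}

# With sub_prob=1.0 every dictionary word is substituted.
print(random_aligned_substitution(["hello", "world", "!"], lexicon, sub_prob=1.0))
# → ['bonjour', 'monde', '!']
```

Training a multilingual encoder on such code-switched pairs encourages "hello" and "bonjour" to receive nearby representations, which is the alignment effect the abstract describes.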
Pages: 2649-2663 (15 pages)
Related Papers (50 total)
  • [21] Graph Neural Pre-training for Recommendation with Side Information
    Liu, Siwei
    Meng, Zaiqiao
    Macdonald, Craig
    Ounis, Iadh
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (03)
  • [22] From Bilingual to Multilingual Neural Machine Translation by Incremental Training
    Escolano, Carlos
    Costa-Jussa, Marta R.
    Fonollosa, Jose A. R.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019, : 236 - 242
  • [23] Discovering Representation Sprachbund For Multilingual Pre-Training
    Fan, Yimin
    Liang, Yaobo
    Muzio, Alexandre
    Hassan, Hany
    Li, Houqiang
    Zhou, Ming
    Duan, Nan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 881 - 894
  • [24] Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation
    Siddhant, Aditya
    Bapna, Ankur
    Cao, Yuan
    Firat, Orhan
    Chen, Mia
    Kudungunta, Sneha
    Arivazhagan, Naveen
    Wu, Yonghui
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 2827 - 2835
  • [25] Multilingual Pre-training with Universal Dependency Learning
    Sun, Kailai
    Li, Zuchao
    Zhao, Hai
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [26] Multilingual Agreement for Multilingual Neural Machine Translation
    Yang, Jian
    Yin, Yuwei
    Ma, Shuming
    Huang, Haoyang
    Zhang, Dongdong
    Li, Zhoujun
    Wei, Furu
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 2, 2021, : 233 - 239
  • [27] Character-Aware Low-Resource Neural Machine Translation with Weight Sharing and Pre-training
    Cao, Yichao
    Li, Miao
    Feng, Tao
    Wang, Rujing
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2019, 2019, 11856 : 321 - 333
  • [28] XLIT: A Method to Bridge Task Discrepancy in Machine Translation Pre-training
    Pham, Khang
    Nguyen, Long
    Dinh, Dien
    ACM Transactions on Asian and Low-Resource Language Information Processing, 2024, 23 (10)
  • [29] Linguistically Driven Multi-Task Pre-Training for Low-Resource Neural Machine Translation
    Mao, Zhuoyuan
    Chu, Chenhui
    Kurohashi, Sadao
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (04)
  • [30] Cross-lingual Visual Pre-training for Multimodal Machine Translation
    Caglayan, Ozan
    Kuyu, Menekse
    Amac, Mustafa Sercan
    Madhyastha, Pranava
    Erdem, Erkut
    Erdem, Aykut
    Specia, Lucia
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1317 - 1324