Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

Cited by: 0
Authors
Lin, Zehui [1 ,2 ]
Pan, Xiao [1 ]
Wang, Mingxuan [1 ]
Qiu, Xipeng [2 ]
Feng, Jiangtao [1 ]
Zhou, Hao [1 ]
Li, Lei [1 ]
Institutions
[1] ByteDance Lab, Beijing, Peoples R China
[2] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
Keywords
DOI
N/A
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as the common seed and obtain derivative, improved models on arbitrary language pairs? We propose mRASP, an approach to pre-train a universal multilingual neural machine translation model. The key idea of mRASP is its novel technique of random aligned substitution, which brings words and phrases with similar meanings across multiple languages closer in the representation space. We pre-train an mRASP model jointly on 32 language pairs using only public datasets. The model is then fine-tuned on downstream language pairs to obtain specialized MT models. We carry out extensive experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs. Experimental results demonstrate that mRASP achieves significant performance improvements over models trained directly on the target pairs. This is the first work to verify that multiple low-resource language pairs can be utilized to improve rich-resource MT. Surprisingly, mRASP even improves translation quality on exotic languages that never occur in the pre-training corpus. Code, data, and pre-trained models are available at https://github.com/linzehui/mRASP.
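The random aligned substitution idea in the abstract can be illustrated with a minimal sketch: tokens in a source sentence are randomly code-switched into dictionary translations from other languages, so that synonyms across languages appear in shared contexts during pre-training. This is an assumption-laden toy version, not the authors' implementation; the function name, the `sub_prob` parameter, and the tiny lexicon below are illustrative only (the paper uses large bilingual dictionaries across its 32 pre-training pairs).

```python
import random

def random_aligned_substitution(tokens, bilingual_dict, sub_prob=0.3, seed=None):
    """Toy sketch of random aligned substitution (RAS).

    Each token present in `bilingual_dict` is replaced, with probability
    `sub_prob`, by one of its translations in another language, producing a
    code-switched sentence for pre-training.
    """
    rng = random.Random(seed)
    out = []
    for tok in tokens:
        translations = bilingual_dict.get(tok)
        if translations and rng.random() < sub_prob:
            out.append(rng.choice(translations))  # swap in a cross-lingual synonym
        else:
            out.append(tok)  # keep the original token
    return out

# Hypothetical English->French lexicon for illustration.
lexicon = {"hello": ["bonjour"], "world": ["monde"]}

# With sub_prob=1.0 every dictionary word is substituted.
print(random_aligned_substitution(["hello", "world", "!"], lexicon, sub_prob=1.0))
# → ['bonjour', 'monde', '!']
```

Training a multilingual encoder on such code-switched pairs encourages "hello" and "bonjour" to receive nearby representations, which is the alignment effect the abstract describes.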
Pages: 2649-2663 (15 pages)
Related Papers (50 total)
  • [21] Graph Neural Pre-training for Recommendation with Side Information
    Liu, Siwei
    Meng, Zaiqiao
    Macdonald, Craig
    Ounis, Iadh
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (03)
  • [22] From Bilingual to Multilingual Neural Machine Translation by Incremental Training
    Escolano, Carlos
    Costa-Jussa, Marta R.
    Fonollosa, Jose A. R.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019, : 236 - 242
  • [23] Discovering Representation Sprachbund For Multilingual Pre-Training
    Fan, Yimin
    Liang, Yaobo
    Muzio, Alexandre
    Hassan, Hany
    Li, Houqiang
    Zhou, Ming
    Duan, Nan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 881 - 894
  • [24] Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation
    Siddhant, Aditya
    Bapna, Ankur
    Cao, Yuan
    Firat, Orhan
    Chen, Mia
    Kudungunta, Sneha
    Arivazhagan, Naveen
    Wu, Yonghui
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 2827 - 2835
  • [25] Multilingual Pre-training with Universal Dependency Learning
    Sun, Kailai
    Li, Zuchao
    Zhao, Hai
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [26] Multilingual Agreement for Multilingual Neural Machine Translation
    Yang, Jian
    Yin, Yuwei
    Ma, Shuming
    Huang, Haoyang
    Zhang, Dongdong
    Li, Zhoujun
    Wei, Furu
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 2, 2021, : 233 - 239
  • [27] Character-Aware Low-Resource Neural Machine Translation with Weight Sharing and Pre-training
    Cao, Yichao
    Li, Miao
    Feng, Tao
    Wang, Rujing
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2019, 2019, 11856 : 321 - 333
  • [28] XLIT: A Method to Bridge Task Discrepancy in Machine Translation Pre-training
    Pham, Khang
    Nguyen, Long
    Dinh, Dien
    ACM Transactions on Asian and Low-Resource Language Information Processing, 2024, 23 (10)
  • [29] Linguistically Driven Multi-Task Pre-Training for Low-Resource Neural Machine Translation
    Mao, Zhuoyuan
    Chu, Chenhui
    Kurohashi, Sadao
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (04)
  • [30] Cross-lingual Visual Pre-training for Multimodal Machine Translation
    Caglayan, Ozan
    Kuyu, Menekse
    Amac, Mustafa Sercan
    Madhyastha, Pranava
    Erdem, Erkut
    Erdem, Aykut
    Specia, Lucia
    16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 1317 - 1324