CSP: Code-Switching Pre-training for Neural Machine Translation

Cited by: 0
Authors
Yang, Zhen [1 ]
Hu, Bojie [1 ]
Han, Ambyera [1 ]
Huang, Shen [1 ]
Ju, Qi [1 ]
Affiliations
[1] Tencent Minor Mandarin Translat, Shenzhen, Peoples R China
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
This paper proposes a new pre-training method for Neural Machine Translation (NMT), called Code-Switching Pre-training (CSP for short). Unlike traditional pre-training methods, which randomly mask fragments of the input sentence, CSP randomly replaces some words in the source sentence with their translations in the target language. Specifically, we first perform lexicon induction with unsupervised word-embedding mapping between the source and target languages, and then randomly replace some words in the input sentence with their translations according to the extracted translation lexicons. CSP adopts the encoder-decoder framework: its encoder takes the code-mixed sentence as input, and its decoder predicts the replaced fragments of the input sentence. In this way, CSP pre-trains the NMT model by explicitly exploiting the cross-lingual alignment information extracted from the source and target monolingual corpora. Additionally, it avoids the pre-train/fine-tune discrepancy caused by artificial symbols such as [mask]. To verify the effectiveness of the proposed method, we conduct extensive experiments on unsupervised and supervised NMT. Experimental results show that CSP achieves significant improvements over baselines without pre-training or with other pre-training methods.
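To make the described pre-training objective concrete, below is a minimal sketch of the code-switching data-construction step, assuming a translation lexicon has already been induced (e.g., via unsupervised embedding mapping such as MUSE or vecmap, one instance of the "unsupervised word embedding mapping" the abstract mentions). The function name csp_example, the stub LEXICON, and the replace_ratio value are all illustrative assumptions, not the paper's released code or its exact hyperparameters.

```python
import random

# Hypothetical induced lexicon: source word -> target-language translation.
# In CSP this comes from unsupervised word-embedding mapping; here it is a stub.
LEXICON = {
    "house": "maison",
    "cat": "chat",
    "drinks": "boit",
    "milk": "lait",
}

def csp_example(src_tokens, lexicon, replace_ratio=0.15, rng=random):
    """Build one CSP pre-training example (illustrative sketch).

    Encoder input : the source sentence with some words replaced by their
                    target-language translations (a code-mixed sentence).
    Decoder target: the original words that were replaced, so the model
                    learns to recover them without any [mask] symbol.
    """
    encoder_input, decoder_target = [], []
    for tok in src_tokens:
        # Only words covered by the induced lexicon can be code-switched.
        if tok in lexicon and rng.random() < replace_ratio:
            encoder_input.append(lexicon[tok])   # switch to the target word
            decoder_target.append(tok)           # model must predict the original
        else:
            encoder_input.append(tok)
    return encoder_input, decoder_target

if __name__ == "__main__":
    random.seed(0)
    src = "the cat drinks milk".split()
    enc_in, dec_out = csp_example(src, LEXICON, replace_ratio=0.5)
    print("encoder input :", enc_in)   # e.g. a code-mixed sentence
    print("decoder target:", dec_out)  # e.g. the replaced source words
```

Because the replaced positions carry real target-language words rather than an artificial [mask] placeholder, the pre-training inputs look much closer to the inputs the fine-tuned NMT model will see, which is the pre-train/fine-tune discrepancy the abstract refers to.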
Pages: 2624-2636 (13 pages)
Related Papers (50 in total; 10 shown)
  • [1] Wang, Mingxuan; Li, Lei. Pre-training Methods for Neural Machine Translation. ACL-IJCNLP 2021: The 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Tutorial Abstracts, 2021: 21-25.
  • [2] Liu, Yinhan; Gu, Jiatao; Goyal, Naman; Li, Xian; Edunov, Sergey; Ghazvininejad, Marjan; Lewis, Mike; Zettlemoyer, Luke. Multilingual Denoising Pre-training for Neural Machine Translation. Transactions of the Association for Computational Linguistics, 2020, 8: 726-742.
  • [3] Liu, Xuebo; Wang, Longyue; Wong, Derek F.; Ding, Liang; Chao, Lidia S.; Shi, Shuming; Tu, Zhaopeng. On the Copying Behaviors of Pre-Training for Neural Machine Translation. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021: 4265-4275.
  • [4] Zou, Aixiao; Wu, Xuanxuan; Li, Xinjie; Zhang, Ting; Cui, Fuwei; Xu, Jinan. Curriculum Pre-training for Stylized Neural Machine Translation. Applied Intelligence, 2024, 54 (17-18): 7958-7968.
  • [5] Shen, Zhijie; Guo, Wu. Improved Deliberation Network with Text Pre-training for Code-Switching Automatic Speech Recognition. Interspeech 2022, 2022: 3854-3858.
  • [6] Hu, Junjie; Hayashi, Hiroaki; Cho, Kyunghyun; Neubig, Graham. DEEP: DEnoising Entity Pre-training for Neural Machine Translation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022: 1753-1766.
  • [7] Liu, Xuebo; Wang, Longyue; Wong, Derek F.; Ding, Liang; Chao, Lidia S.; Shi, Shuming; Tu, Zhaopeng. On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation. Findings of the Association for Computational Linguistics: EMNLP 2021, 2021: 2900-2907.
  • [8] Li, Pengfei; Li, Liangyou; Zhang, Meng; Wu, Minghao; Liu, Qun. Universal Conditional Masked Language Pre-training for Neural Machine Translation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022: 6379-6391.
  • [9] Lin, Zehui; Pan, Xiao; Wang, Mingxuan; Qiu, Xipeng; Feng, Jiangtao; Zhou, Hao; Li, Lei. Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 2649-2663.
  • [10] Song, Haiyue; Dabre, Raj; Mao, Zhuoyuan; Cheng, Fei; Kurohashi, Sadao; Sumita, Eiichiro. Pre-training via Leveraging Assisting Languages for Neural Machine Translation. 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020): Student Research Workshop, 2020: 279-285.