CSP: Code-Switching Pre-training for Neural Machine Translation

Cited by: 0
Authors
Yang, Zhen [1 ]
Hu, Bojie [1 ]
Han, Ambyera [1 ]
Huang, Shen [1 ]
Ju, Qi [1 ]
Affiliations
[1] Tencent Minor Mandarin Translat, Shenzhen, Peoples R China
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
This paper proposes a new pre-training method for Neural Machine Translation (NMT), called Code-Switching Pre-training (CSP for short). Unlike traditional pre-training methods, which randomly mask fragments of the input sentence, CSP randomly replaces some words in the source sentence with their translations in the target language. Specifically, we first perform lexicon induction with unsupervised word-embedding mapping between the source and target languages, and then randomly replace some words in the input sentence with their translations according to the extracted translation lexicons. CSP adopts the encoder-decoder framework: its encoder takes the code-mixed sentence as input, and its decoder predicts the replaced fragments of the input sentence. In this way, CSP pre-trains the NMT model by explicitly exploiting the cross-lingual alignment information extracted from the source and target monolingual corpora. Additionally, it avoids the pre-train/fine-tune discrepancy caused by artificial symbols such as [mask]. To verify the effectiveness of the proposed method, we conduct extensive experiments on unsupervised and supervised NMT. Experimental results show that CSP achieves significant improvements over baselines without pre-training or with other pre-training methods.
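To make the described pre-training objective concrete, below is a minimal sketch of the code-switching data-construction step, assuming a translation lexicon has already been induced (e.g., via unsupervised embedding mapping such as MUSE or vecmap, one instance of the "unsupervised word embedding mapping" the abstract mentions). The function name csp_example, the stub LEXICON, and the replace_ratio value are all illustrative assumptions, not the paper's released code or its exact hyperparameters.

```python
import random

# Hypothetical induced lexicon: source word -> target-language translation.
# In CSP this comes from unsupervised word-embedding mapping; here it is a stub.
LEXICON = {
    "house": "maison",
    "cat": "chat",
    "drinks": "boit",
    "milk": "lait",
}

def csp_example(src_tokens, lexicon, replace_ratio=0.15, rng=random):
    """Build one CSP pre-training example (illustrative sketch).

    Encoder input : the source sentence with some words replaced by their
                    target-language translations (a code-mixed sentence).
    Decoder target: the original words that were replaced, so the model
                    learns to recover them without any [mask] symbol.
    """
    encoder_input, decoder_target = [], []
    for tok in src_tokens:
        # Only words covered by the induced lexicon can be code-switched.
        if tok in lexicon and rng.random() < replace_ratio:
            encoder_input.append(lexicon[tok])   # switch to the target word
            decoder_target.append(tok)           # model must predict the original
        else:
            encoder_input.append(tok)
    return encoder_input, decoder_target

if __name__ == "__main__":
    random.seed(0)
    src = "the cat drinks milk".split()
    enc_in, dec_out = csp_example(src, LEXICON, replace_ratio=0.5)
    print("encoder input :", enc_in)   # e.g. a code-mixed sentence
    print("decoder target:", dec_out)  # e.g. the replaced source words
```

Because the replaced positions carry real target-language words rather than an artificial [mask] placeholder, the pre-training inputs look much closer to the inputs the fine-tuned NMT model will see, which is the pre-train/fine-tune discrepancy the abstract refers to.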
Pages: 2624-2636 (13 pages)
Related Papers (50 in total; 10 shown)
  • [1] Wang, Mingxuan; Li, Lei. Pre-training Methods for Neural Machine Translation. ACL-IJCNLP 2021: The 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Tutorial Abstracts, 2021: 21-25.
  • [2] Liu, Yinhan; Gu, Jiatao; Goyal, Naman; Li, Xian; Edunov, Sergey; Ghazvininejad, Marjan; Lewis, Mike; Zettlemoyer, Luke. Multilingual Denoising Pre-training for Neural Machine Translation. Transactions of the Association for Computational Linguistics, 2020, 8: 726-742.
  • [3] Liu, Xuebo; Wang, Longyue; Wong, Derek F.; Ding, Liang; Chao, Lidia S.; Shi, Shuming; Tu, Zhaopeng. On the Copying Behaviors of Pre-Training for Neural Machine Translation. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021: 4265-4275.
  • [4] Zou, Aixiao; Wu, Xuanxuan; Li, Xinjie; Zhang, Ting; Cui, Fuwei; Xu, Jinan. Curriculum Pre-training for Stylized Neural Machine Translation. Applied Intelligence, 2024, 54 (17-18): 7958-7968.
  • [5] Shen, Zhijie; Guo, Wu. Improved Deliberation Network with Text Pre-training for Code-Switching Automatic Speech Recognition. Interspeech 2022, 2022: 3854-3858.
  • [6] Hu, Junjie; Hayashi, Hiroaki; Cho, Kyunghyun; Neubig, Graham. DEEP: DEnoising Entity Pre-training for Neural Machine Translation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022: 1753-1766.
  • [7] Liu, Xuebo; Wang, Longyue; Wong, Derek F.; Ding, Liang; Chao, Lidia S.; Shi, Shuming; Tu, Zhaopeng. On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation. Findings of the Association for Computational Linguistics: EMNLP 2021, 2021: 2900-2907.
  • [8] Li, Pengfei; Li, Liangyou; Zhang, Meng; Wu, Minghao; Liu, Qun. Universal Conditional Masked Language Pre-training for Neural Machine Translation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol. 1: Long Papers, 2022: 6379-6391.
  • [9] Lin, Zehui; Pan, Xiao; Wang, Mingxuan; Qiu, Xipeng; Feng, Jiangtao; Zhou, Hao; Li, Lei. Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020: 2649-2663.
  • [10] Song, Haiyue; Dabre, Raj; Mao, Zhuoyuan; Cheng, Fei; Kurohashi, Sadao; Sumita, Eiichiro. Pre-training via Leveraging Assisting Languages for Neural Machine Translation. 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020): Student Research Workshop, 2020: 279-285.