Prediction Difference Regularization against Perturbation for Neural Machine Translation

Cited by: 0
Authors
Guo, Dengji [1 ,2 ]
Ma, Zhengrui [1 ,2 ]
Zhang, Min [3 ]
Feng, Yang [1 ,2 ]
Affiliations
[1] Chinese Acad Sci ICT CAS, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Harbin Inst Technol, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022
Funding
National Key Research and Development Program of China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Regularization methods that apply input perturbation have drawn considerable attention and have been explored extensively for NMT tasks in recent years. Despite their simplicity and effectiveness, we argue that these methods are limited by under-fitting of the training data. In this paper, we use the prediction difference for ground-truth tokens to analyze how well token-level samples are fitted and find that under-fitting is almost as common as over-fitting. We introduce prediction difference regularization (PD-R), a simple and effective method that reduces over-fitting and under-fitting at the same time. For all token-level samples, PD-R minimizes the prediction difference between the original pass and the input-perturbed pass, making the model less sensitive to small input changes and thus more robust to both perturbations and under-fitted training data. Experiments on three widely used WMT translation tasks show that our approach significantly improves over existing perturbation regularization methods. On the WMT16 En-De task, our model achieves a 1.80 SacreBLEU improvement over the vanilla Transformer.
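The abstract describes PD-R as a loss term that penalizes the difference between the model's token-level predictions on the original input and on a perturbed input, on top of the usual cross-entropy. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' implementation: the word-dropout perturbation, the symmetric KL divergence used as the "prediction difference", the weighting factor `pd_weight`, and the `model(src_tokens, prev_output_tokens)` signature returning token logits are all hypothetical choices made for the example.

```python
# Minimal PyTorch sketch of prediction-difference regularization (PD-R)
# as described in the abstract. The perturbation (word dropout), the
# divergence (symmetric KL), and the weight `pd_weight` are illustrative
# assumptions, not the paper's exact recipe.
import torch
import torch.nn.functional as F


def word_dropout(src_tokens: torch.Tensor, unk_id: int, p: float = 0.1) -> torch.Tensor:
    """Randomly replace source tokens with <unk> as a simple input perturbation."""
    mask = torch.rand_like(src_tokens, dtype=torch.float) < p
    return src_tokens.masked_fill(mask, unk_id)


def pd_r_loss(model, src_tokens, prev_output_tokens, target,
              unk_id: int, pad_id: int, pd_weight: float = 1.0):
    """Cross-entropy on both passes plus a symmetric KL between their predictions."""
    # Original pass: logits of shape (batch, tgt_len, vocab) are assumed.
    logits_clean = model(src_tokens, prev_output_tokens)
    # Input-perturbed pass over the same target prefix.
    logits_pert = model(word_dropout(src_tokens, unk_id), prev_output_tokens)

    def ce(logits):
        return F.cross_entropy(
            logits.view(-1, logits.size(-1)), target.view(-1), ignore_index=pad_id)

    nll = ce(logits_clean) + ce(logits_pert)

    # Prediction difference: symmetric token-level KL between the two passes.
    logp_clean = F.log_softmax(logits_clean, dim=-1)
    logp_pert = F.log_softmax(logits_pert, dim=-1)
    pd = 0.5 * (
        F.kl_div(logp_pert, logp_clean, log_target=True, reduction="batchmean")
        + F.kl_div(logp_clean, logp_pert, log_target=True, reduction="batchmean")
    )
    return nll + pd_weight * pd
```

With this formulation, driving the divergence term toward zero makes the decoder's distribution insensitive to small input changes, which is the robustness property the abstract attributes to PD-R; the exact divergence and perturbation used in the paper may differ.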
Pages: 7665-7675
Page count: 11