Prediction Difference Regularization against Perturbation for Neural Machine Translation

Cited by: 0
Authors
Guo, Dengji [1 ,2 ]
Ma, Zhengrui [1 ,2 ]
Zhang, Min [3 ]
Feng, Yang [1 ,2 ]
Affiliations
[1] Chinese Acad Sci ICT CAS, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Harbin Inst Technol, Shenzhen, Peoples R China
Source
PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1 (LONG PAPERS), 2022
Funding
National Key Research and Development Program of China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Regularization methods that apply input perturbation have drawn considerable attention and have been explored extensively for NMT tasks in recent years. Despite their simplicity and effectiveness, we argue that these methods are limited by under-fitting of the training data. In this paper, we use the prediction difference for ground-truth tokens to analyze how well token-level samples are fitted and find that under-fitting is almost as common as over-fitting. We introduce prediction difference regularization (PD-R), a simple and effective method that reduces over-fitting and under-fitting at the same time. For all token-level samples, PD-R minimizes the prediction difference between the original pass and the input-perturbed pass, making the model less sensitive to small input changes and thus more robust to both perturbations and under-fitted training data. Experiments on three widely used WMT translation tasks show that our approach significantly improves over existing perturbation regularization methods. On the WMT16 En-De task, our model achieves a 1.80 SacreBLEU improvement over the vanilla Transformer.
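The abstract describes PD-R as a loss term that penalizes the difference between the model's token-level predictions on the original input and on a perturbed input, on top of the usual cross-entropy. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' implementation: the word-dropout perturbation, the symmetric KL divergence used as the "prediction difference", the weighting factor `pd_weight`, and the `model(src_tokens, prev_output_tokens)` signature returning token logits are all hypothetical choices made for the example.

```python
# Minimal PyTorch sketch of prediction-difference regularization (PD-R)
# as described in the abstract. The perturbation (word dropout), the
# divergence (symmetric KL), and the weight `pd_weight` are illustrative
# assumptions, not the paper's exact recipe.
import torch
import torch.nn.functional as F


def word_dropout(src_tokens: torch.Tensor, unk_id: int, p: float = 0.1) -> torch.Tensor:
    """Randomly replace source tokens with <unk> as a simple input perturbation."""
    mask = torch.rand_like(src_tokens, dtype=torch.float) < p
    return src_tokens.masked_fill(mask, unk_id)


def pd_r_loss(model, src_tokens, prev_output_tokens, target,
              unk_id: int, pad_id: int, pd_weight: float = 1.0):
    """Cross-entropy on both passes plus a symmetric KL between their predictions."""
    # Original pass: logits of shape (batch, tgt_len, vocab) are assumed.
    logits_clean = model(src_tokens, prev_output_tokens)
    # Input-perturbed pass over the same target prefix.
    logits_pert = model(word_dropout(src_tokens, unk_id), prev_output_tokens)

    def ce(logits):
        return F.cross_entropy(
            logits.view(-1, logits.size(-1)), target.view(-1), ignore_index=pad_id)

    nll = ce(logits_clean) + ce(logits_pert)

    # Prediction difference: symmetric token-level KL between the two passes.
    logp_clean = F.log_softmax(logits_clean, dim=-1)
    logp_pert = F.log_softmax(logits_pert, dim=-1)
    pd = 0.5 * (
        F.kl_div(logp_pert, logp_clean, log_target=True, reduction="batchmean")
        + F.kl_div(logp_clean, logp_pert, log_target=True, reduction="batchmean")
    )
    return nll + pd_weight * pd
```

With this formulation, driving the divergence term toward zero makes the decoder's distribution insensitive to small input changes, which is the robustness property the abstract attributes to PD-R; the exact divergence and perturbation used in the paper may differ.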
Pages: 7665-7675
Page count: 11