Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation

Cited by: 0

Authors:

Mueller, Mathias [1]

Sennrich, Rico [1,2]

Affiliations:

[1] Univ Zurich, Dept Computat Linguist, Zurich, Switzerland

[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland

Source:

59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021) | 2021

Funding:

Swiss National Science Foundation

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words, and shows poor robustness to copy noise in training data or domain shift. Recent work has tied these shortcomings to beam search, the de facto standard inference algorithm in NMT, and Eikema and Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead. In this paper, we empirically investigate the properties of MBR decoding on a number of previously reported biases and failure cases of beam search. We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases robustness against copy noise in the training data and domain shift.
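For readers unfamiliar with the procedure named in the abstract, sampling-based MBR decoding can be sketched roughly as follows: draw a pool of translations from the model, score each candidate by its average utility against the pool, and return the highest-scoring one. The sketch below is illustrative only and is not the authors' implementation. The function names `unigram_f1` and `mbr_decode`, the toy token-overlap utility (a stand-in for a real MT metric), and the hard-coded sample pool are all assumptions made for this example.

```python
from collections import Counter


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy utility (assumption): F1 over whitespace tokens, standing in for an MT metric."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count_hyp, count_ref) times.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mbr_decode(samples, utility=unigram_f1):
    """Return the sample with the highest average utility against the whole pool.

    The same pool serves as candidates and as pseudo-references, so each
    candidate is also compared with itself (a simplifying assumption here).
    """
    def expected_utility(candidate):
        return sum(utility(candidate, s) for s in samples) / len(samples)

    return max(samples, key=expected_utility)


# Hypothetical pool of model samples for one source sentence.
samples = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a cat is sitting on the mat",
    "the the the the",
]
best = mbr_decode(samples)
print(best)
```

Because the chosen translation is whichever one the utility function scores highest on average, any preference built into that function (for example, toward certain lengths or frequent tokens) carries over to the output, which is in line with the abstract's remark that the remaining biases owe to the MT metrics used as utility functions.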


Pages: 259-272

Page count: 14
