Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation

Cited by: 0
Authors
Mueller, Mathias [1]
Sennrich, Rico [1,2]
Affiliations
[1] Univ Zurich, Dept Computat Linguist, Zurich, Switzerland
[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Funding
Swiss National Science Foundation
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words, and shows poor robustness to copy noise in training data or domain shift. Recent work has tied these shortcomings to beam search - the de facto standard inference algorithm in NMT - and Eikema and Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead. In this paper, we empirically investigate the properties of MBR decoding on a number of previously reported biases and failure cases of beam search. We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases robustness against copy noise in the training data and domain shift.
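For readers unfamiliar with the method under study, the sketch below illustrates sampling-based MBR decoding in the spirit of Eikema and Aziz (2020): candidates sampled from the model are scored by their average utility against the full sample pool (used as pseudo-references), and the candidate with the highest expected utility is selected. This is a minimal illustration, not the paper's implementation: the unigram_f1 utility and the placeholder sample pool are stand-ins, whereas the paper uses standard MT metrics as utility functions, a choice the abstract links to the remaining length and frequency bias.

```python
# Minimal sketch of sampling-based MBR decoding.
# Assumptions (illustrative, not from the paper's code): candidates would come
# from unbiased ancestral sampling of the NMT model, and a toy unigram F1
# stands in for a real MT metric used as the utility function.
from collections import Counter
from typing import List


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy utility: unigram F1 between hypothesis and reference tokens."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(samples: List[str]) -> str:
    """Return the candidate with the highest average utility against all
    samples, which double as pseudo-references (Monte Carlo risk estimate)."""
    def expected_utility(candidate: str) -> float:
        return sum(unigram_f1(candidate, ref) for ref in samples) / len(samples)

    return max(samples, key=expected_utility)


if __name__ == "__main__":
    # Placeholder pool; in practice these would be samples from the NMT model.
    pool = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "a cat is sitting on the mat",
    ]
    print(mbr_decode(pool))
```

Note that the choice of utility function matters: a length-insensitive or frequency-biased metric will pass its bias on to the selected translation, which is the effect the abstract reports.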
Pages: 259-272
Page count: 14