Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation

Cited by: 0
Authors
Mueller, Mathias [1]
Sennrich, Rico [1,2]
Affiliations
[1] Univ Zurich, Dept Computat Linguist, Zurich, Switzerland
[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Funding
Swiss National Science Foundation
Keywords
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words, and shows poor robustness to copy noise in training data or domain shift. Recent work has tied these shortcomings to beam search - the de facto standard inference algorithm in NMT - and Eikema and Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead. In this paper, we empirically investigate the properties of MBR decoding on a number of previously reported biases and failure cases of beam search. We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases robustness against copy noise in the training data and domain shift.
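For readers unfamiliar with the method under study, the sketch below illustrates sampling-based MBR decoding in the spirit of Eikema and Aziz (2020): candidates sampled from the model are scored by their average utility against the full sample pool (used as pseudo-references), and the candidate with the highest expected utility is selected. This is a minimal illustration, not the paper's implementation: the unigram_f1 utility and the placeholder sample pool are stand-ins, whereas the paper uses standard MT metrics as utility functions, a choice the abstract links to the remaining length and frequency bias.

```python
# Minimal sketch of sampling-based MBR decoding.
# Assumptions (illustrative, not from the paper's code): candidates would come
# from unbiased ancestral sampling of the NMT model, and a toy unigram F1
# stands in for a real MT metric used as the utility function.
from collections import Counter
from typing import List


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy utility: unigram F1 between hypothesis and reference tokens."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(samples: List[str]) -> str:
    """Return the candidate with the highest average utility against all
    samples, which double as pseudo-references (Monte Carlo risk estimate)."""
    def expected_utility(candidate: str) -> float:
        return sum(unigram_f1(candidate, ref) for ref in samples) / len(samples)

    return max(samples, key=expected_utility)


if __name__ == "__main__":
    # Placeholder pool; in practice these would be samples from the NMT model.
    pool = [
        "the cat sat on the mat",
        "the cat sat on a mat",
        "a cat is sitting on the mat",
    ]
    print(mbr_decode(pool))
```

Note that the choice of utility function matters: a length-insensitive or frequency-biased metric will pass its bias on to the selected translation, which is the effect the abstract reports.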
Pages: 259-272
Page count: 14