Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation

Cited by: 0
Authors
Mueller, Mathias [1]
Sennrich, Rico [1, 2]
Affiliations
[1] Univ Zurich, Dept Computat Linguist, Zurich, Switzerland
[2] Univ Edinburgh, Sch Informat, Edinburgh, Midlothian, Scotland
Funding
Swiss National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words, and shows poor robustness to copy noise in training data or domain shift. Recent work has tied these shortcomings to beam search - the de facto standard inference algorithm in NMT - and Eikema and Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead. In this paper, we empirically investigate the properties of MBR decoding on a number of previously reported biases and failure cases of beam search. We find that MBR still exhibits a length and token frequency bias, owing to the MT metrics used as utility functions, but that MBR also increases robustness against copy noise in the training data and domain shift.
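As a rough illustration of the sampling-based MBR decoding that the abstract refers to, the sketch below picks, from a pool of sampled translations, the one with the highest average utility against the whole pool. It is not the authors' implementation: the function names `unigram_f1` and `mbr_decode` are hypothetical, the toy unigram F1 score merely stands in for the MT metrics mentioned in the abstract, and using the same samples as both candidates and pseudo-references (including each candidate against itself) is an assumption.

```python
from collections import Counter
from typing import Callable, Sequence


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption): unigram F1 between whitespace-tokenized strings."""
    hyp_counts, ref_counts = Counter(hyp.split()), Counter(ref.split())
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((hyp_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    samples: Sequence[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the sample with the highest average utility against all samples.

    The samples serve as both the candidate pool and the pseudo-references,
    which gives a Monte Carlo estimate of each candidate's expected utility.
    """
    if not samples:
        raise ValueError("need at least one sample")

    def expected_utility(candidate: str) -> float:
        return sum(utility(candidate, other) for other in samples) / len(samples)

    # max() returns the first candidate in case of ties.
    return max(samples, key=expected_utility)


# Hypothetical pool of samples drawn from a translation model.
pool = ["a b c", "a b c", "x y z"]
best = mbr_decode(pool)
```

Because the selected output is whichever candidate the utility function favours on average, any preference built into that function (for example towards particular lengths or frequent tokens) can carry over to the output, which is consistent with the abstract's observation that the biases stem from the MT metrics used as utilities.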
Pages: 259-272
Page count: 14
Related papers
50 in total
  • [1] Minimum Bayes-risk decoding for statistical machine translation
    Kumar, S
    Byrne, W
    [J]. HLT-NAACL 2004: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2004, : 169 - 176
  • [2] Minimum Bayes' risk subsequence combination for machine translation
    Gonzalez-Rubio, Jesus
    Casacuberta, Francisco
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2015, 18 (03) : 523 - 533
  • [3] Minimum Bayes’ risk subsequence combination for machine translation
    Jesús González-Rubio
    Francisco Casacuberta
    [J]. Pattern Analysis and Applications, 2015, 18 : 523 - 533
  • [4] Minimum Risk Training for Neural Machine Translation
    Shen, Shiqi
    Cheng, Yong
    He, Zhongjun
    He, Wei
    Wu, Hua
    Sun, Maosong
    Liu, Yang
    [J]. PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1683 - 1692
  • [5] Discriminative training for segmental Minimum Bayes Risk decoding
    Doumpiotis, V
    Tsakalidis, S
    Byrne, W
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 136 - 139
  • [6] Asynchronous Bidirectional Decoding for Neural Machine Translation
    Zhang, Xiangwen
    Su, Jinsong
    Qin, Yue
    Liu, Yang
    Ji, Rongrong
    Wang, Hongji
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5698 - 5705
  • [7] Decoding with Value Networks for Neural Machine Translation
    He, Di
    Lu, Hanqing
    Xia, Yingce
    Qin, Tao
    Wang, Liwei
    Liu, Tie-Yan
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [8] Centroid-Based Efficient Minimum Bayes Risk Decoding
    Deguchi, Hiroyuki
    Sakai, Yusuke
    Kamigaito, Hidetaka
    Watanabe, Taro
    Tanaka, Hideki
    Utiyama, Masao
    [J]. Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2024, : 11009 - 11018
  • [9] High Quality Rather than High Model Probability: Minimum Bayes Risk Decoding with Neural Metrics
    Freitag, Markus
    Grangier, David
    Tan, Qijun
    Liang, Bowen
    [J]. TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2022, 10 : 811 - 825
  • [10] Visualizing and Understanding Neural Machine Translation
    Ding, Yanzhuo
    Liu, Yang
    Luan, Huanbo
    Sun, Maosong
    [J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 1150 - 1159