Minimum Risk Training for Neural Machine Translation

Authors
Shen, Shiqi [1]
Cheng, Yong [2]
He, Zhongjun [3]
He, Wei [3]
Wu, Hua [3]
Sun, Maosong [1]
Liu, Yang [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing, Peoples R China
[2] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
[3] Baidu Inc, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Research Foundation, Singapore
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose minimum risk training for end-to-end neural machine translation. Unlike conventional maximum likelihood estimation, minimum risk training is capable of optimizing model parameters directly with respect to arbitrary evaluation metrics, which are not necessarily differentiable. Experiments show that our approach achieves significant improvements over maximum likelihood estimation on a state-of-the-art neural machine translation system across various language pairs. Transparent to architectures, our approach can be applied to more neural networks and potentially benefit more NLP tasks.
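As a rough illustration of why the evaluation metric need not be differentiable, the sketch below computes an expected cost over a small set of candidate translations. It is not taken from this record: the function name `expected_risk`, the sharpness parameter `alpha`, the candidate scores, and the cost values are all assumptions made for illustration. The costs enter only as constants (for example, 1 minus a sentence-level metric score), so in a real system gradients would flow through the model's candidate probabilities rather than through the metric.

```python
import math


def expected_risk(log_probs, costs, alpha=1.0):
    """Expected cost over a candidate set (illustrative sketch, hypothetical helper).

    log_probs: model log-probabilities of each candidate translation.
    costs:     metric-based cost of each candidate against the reference;
               treated as constants, so the metric itself is never differentiated.
    alpha:     assumed sharpness factor applied before renormalizing.
    """
    if not log_probs or len(log_probs) != len(costs):
        raise ValueError("log_probs and costs must be non-empty and equally long")

    # Scale the log-probabilities, then renormalize over the candidate set.
    scaled = [alpha * lp for lp in log_probs]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)

    # Risk = sum over candidates of (renormalized probability * cost).
    return sum((w / total) * c for w, c in zip(weights, costs))


# Three hypothetical candidates with probabilities 0.5, 0.3, 0.2.
candidate_log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
candidate_costs = [0.2, 0.5, 0.9]  # made-up costs, lower is better
risk = expected_risk(candidate_log_probs, candidate_costs, alpha=1.0)
# risk is approximately 0.43 (0.5*0.2 + 0.3*0.5 + 0.2*0.9)
```

Lowering this quantity shifts probability mass toward low-cost candidates. With `alpha=0.0` the weights become uniform and the result is the plain average of the costs.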
Pages: 1683-1692
Page count: 10