Minimum Risk Training for Neural Machine Translation

Citations: 0
Authors
Shen, Shiqi [1 ]
Cheng, Yong [2 ]
He, Zhongjun [3 ]
He, Wei [3 ]
Wu, Hua [3 ]
Sun, Maosong [1 ]
Liu, Yang [1 ]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing, Peoples R China
[2] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
[3] Baidu Inc, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Research Foundation of Singapore;
Keywords
DOI
None
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
We propose minimum risk training for end-to-end neural machine translation. Unlike conventional maximum likelihood estimation, minimum risk training is capable of optimizing model parameters directly with respect to arbitrary evaluation metrics, which are not necessarily differentiable. Experiments show that our approach achieves significant improvements over maximum likelihood estimation on a state-of-the-art neural machine translation system across various language pairs. Transparent to architectures, our approach can be applied to more neural networks and potentially benefit more NLP tasks.
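The abstract describes optimizing model parameters against an arbitrary, possibly non-differentiable metric. The core idea of minimum risk training is to minimize the expected loss (risk) over a sampled subset of candidate translations, with candidate probabilities renormalized over that subset and sharpened by a smoothing hyperparameter. The sketch below is an illustrative reconstruction of that expected-risk computation, not the paper's implementation; the function name `mrt_expected_risk` and the default smoothing value are assumptions.

```python
import math

def mrt_expected_risk(log_probs, risks, alpha=0.005):
    """Expected risk over a sampled candidate set.

    log_probs : model log-probabilities of each sampled translation
    risks     : per-candidate loss, e.g. 1 - sentence-level BLEU
    alpha     : smoothing hyperparameter that sharpens or flattens
                the renormalized distribution over the sample
    """
    # Renormalize the model distribution over the sampled subset,
    # scaled by alpha (a numerically stable softmax).
    scaled = [alpha * lp for lp in log_probs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    q = [e / z for e in exps]
    # Risk = expectation of the (non-differentiable) metric-based loss
    # under the renormalized distribution; gradients flow through q only.
    return sum(qi * ri for qi, ri in zip(q, risks))
```

With uniform log-probabilities the result is simply the mean candidate risk; as `alpha` grows, the expectation concentrates on the highest-probability candidates, which is why the metric itself never needs to be differentiable — only the sample probabilities do.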
Pages: 1683 - 1692
Page count: 10
Related Papers
50 items
  • [11] Shallow-to-Deep Training for Neural Machine Translation
    Li, Bei
    Wang, Ziyang
    Liu, Hui
    Jiang, Yufan
    Du, Quan
    Xiao, Tong
    Wang, Huizhen
    Zhu, Jingbo
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 995 - 1005
  • [12] Pre-training Methods for Neural Machine Translation
    Wang, Mingxuan
    Li, Lei
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING: TUTORIAL ABSTRACTS, 2021, : 21 - 25
  • [13] Restricted or Not: A General Training Framework for Neural Machine Translation
    Li, Zuchao
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022): STUDENT RESEARCH WORKSHOP, 2022, : 245 - 251
  • [14] Forced decoding for minimum error rate training in statistical machine translation
    Liang, Huashen
    Zhang, Min
    Zhao, Tiejun
    Journal of Computational Information Systems, 2012, 8 (02): : 861 - 868
  • [15] Minimum Bayes-risk decoding for statistical machine translation
    Kumar, S
    Byrne, W
    HLT-NAACL 2004: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2004, : 169 - 176
  • [16] Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation
    Khayrallah, Huda
    Thompson, Brian
    Duh, Kevin
    Koehn, Philipp
    NEURAL MACHINE TRANSLATION AND GENERATION, 2018, : 36 - 44
  • [17] Adversarial Training for Unknown Word Problems in Neural Machine Translation
    Ji, Yatu
    Hou, Hongxu
    Chen, Junjie
    Wu, Nier
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (01)
  • [18] Iterative Training of Unsupervised Neural and Statistical Machine Translation Systems
    Marie, Benjamin
    Fujita, Atsushi
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (05)
  • [19] Moment matching training for neural machine translation: An empirical study
    Nguyen, Long H. B.
    Pham, Nghi T.
    Duc, Le D. C.
    Cong Duy Vu Hoang
    Dien Dinh
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (03) : 2633 - 2645
  • [20] Joint Training for Pivot-based Neural Machine Translation
    Cheng, Yong
    Yang, Qian
    Liu, Yang
    Sun, Maosong
    Xu, Wei
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3974 - 3980