Generalizing Back-Translation in Neural Machine Translation

Authors
Graca, Miguel [1, 3]
Kim, Yunsu [1]
Schamper, Julian [1, 3]
Khadivi, Shahram [2]
Ney, Hermann [1]
Affiliations
[1] RWTH Aachen University, Human Language Technology and Pattern Recognition Group, Aachen, Germany
[2] eBay Inc, Aachen, Germany
[3] DeepL GmbH, Cologne, Germany
Funding
European Research Council
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Back-translation (data augmentation by translating target monolingual data) is a crucial component in modern neural machine translation (NMT). In this work, we reformulate back-translation in the scope of cross-entropy optimization of an NMT model, clarifying its underlying mathematical assumptions and approximations beyond its heuristic usage. Our formulation covers broader synthetic data generation schemes, including sampling from a target-to-source NMT model. With this formulation, we point out fundamental problems of the sampling-based approaches and propose to remedy them by (i) disabling label smoothing for the target-to-source model and (ii) sampling from a restricted search space. Our statements are investigated on the WMT 2018 German↔English news translation task.
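
As a rough illustration of remedy (ii), the sketch below restricts a single next-token distribution to its k most probable entries before sampling. This is an assumption made for illustration only: the abstract does not say how the search space is restricted, so top-k truncation stands in for whatever scheme the paper uses, and the names `softmax`, `restrict_top_k`, and `sample_token` are hypothetical helpers, not part of any NMT toolkit.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def restrict_top_k(probs, k):
    """Keep only the k most probable tokens and renormalize.

    Assumption: top-k truncation is one possible way to restrict the
    search space; the source abstract does not specify the method.
    """
    keep = set(sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k])
    mass = sum(probs[i] for i in keep)
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]


def sample_token(probs, rng):
    """Draw one token index according to the given distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy next-token scores over a 4-word vocabulary (made-up numbers).
full = softmax([2.0, 1.0, 0.0, -1.0])
restricted = restrict_top_k(full, k=2)

rng = random.Random(0)
draws = [sample_token(restricted, rng) for _ in range(1000)]
```

Under the full distribution every token, however unlikely, can be drawn; after the restriction only the two highest-scoring tokens can appear in `draws`. Probability mass spread over unlikely tokens, as label smoothing encourages during training, makes unrestricted sampling more prone to picking such tokens, which is one intuition for why the two remedies are paired.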
Pages: 45-52
Number of pages: 8