Towards a Better Understanding of Label Smoothing in Neural Machine Translation

Authors
Gao, Yingbo [1 ]
Wang, Weiyue [1 ]
Herold, Christian [1 ]
Yang, Zijian [1 ]
Ney, Hermann [1 ]
Affiliation
[1] Rhein Westfal TH Aachen, Comp Sci Dept, Human Language Technol & Pattern Recognit Grp, D-52056 Aachen, Germany
Source
1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020) | 2020
Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In order to combat overfitting and in pursuit of better generalization, label smoothing is widely applied in modern neural machine translation systems. The core idea is to penalize over-confident outputs and regularize the model so that its outputs do not diverge too much from some prior distribution. While training perplexity generally gets worse, label smoothing is found to consistently improve test performance. In this work, we aim to better understand label smoothing in the context of neural machine translation. Theoretically, we derive and explain exactly what label smoothing is optimizing for. Practically, we conduct extensive experiments by varying which tokens to smooth, tuning the probability mass to be deducted from the true targets and considering different prior distributions. We show that label smoothing is theoretically well-motivated, and by carefully choosing hyperparameters, the practical performance of strong neural machine translation systems can be further improved.
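The smoothing step described in the abstract (deducting probability mass from the true target and redistributing it according to a prior) can be illustrated with a minimal sketch. It assumes the commonly used formulation in which a fraction `epsilon` of the mass is moved from the one-hot target to a prior distribution; the paper's exact objective may differ. The function names `smooth_targets` and `cross_entropy`, the uniform default prior, and the toy probabilities are hypothetical and chosen only for illustration.

```python
import math


def smooth_targets(true_index, vocab_size, epsilon=0.1, prior=None):
    """Return a smoothed target distribution over a toy vocabulary.

    Assumption: epsilon of the probability mass is taken from the true
    token and redistributed according to `prior` (uniform by default).
    """
    if prior is None:
        prior = [1.0 / vocab_size] * vocab_size  # uniform prior (assumption)
    return [
        (1.0 - epsilon) * (1.0 if i == true_index else 0.0) + epsilon * prior[i]
        for i in range(vocab_size)
    ]


def cross_entropy(target_dist, model_probs):
    """Cross entropy between a target distribution and model probabilities."""
    return -sum(
        t * math.log(q) for t, q in zip(target_dist, model_probs) if t > 0.0
    )


# Toy model output over a 3-token vocabulary; token 0 is the true target.
model_probs = [0.7, 0.2, 0.1]

hard = smooth_targets(0, 3, epsilon=0.0)  # one-hot target, no smoothing
soft = smooth_targets(0, 3, epsilon=0.1)  # smoothed target

loss_hard = cross_entropy(hard, model_probs)
loss_soft = cross_entropy(soft, model_probs)
```

In this toy case the smoothed loss is larger than the unsmoothed one because the model is also charged for the low probability it assigns to the non-target tokens, which is consistent with the abstract's remark that training perplexity generally gets worse under label smoothing.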
Pages: 212-223
Page count: 12