Prediction Difference Regularization against Perturbation for Neural Machine Translation

Authors
Guo, Dengji [1, 2]
Ma, Zhengrui [1, 2]
Zhang, Min [3]
Feng, Yang [1, 2]
Affiliations
[1] Chinese Acad Sci ICT CAS, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
[3] Harbin Inst Technol, Shenzhen, Peoples R China
Funding
National Key Research and Development Program;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Regularization methods applying input perturbation have drawn considerable attention and have been frequently explored for NMT tasks in recent years. Despite their simplicity and effectiveness, we argue that these methods are limited by the under-fitting of training data. In this paper, we utilize prediction difference for ground-truth tokens to analyze the fitting of token-level samples and find that under-fitting is almost as common as over-fitting. We introduce prediction difference regularization (PD-R), a simple and effective method that can reduce over-fitting and under-fitting at the same time. For all token-level samples, PD-R minimizes the prediction difference between the original pass and the input-perturbed pass, making the model less sensitive to small input changes and thus more robust to both perturbations and under-fitted training data. Experiments on three widely used WMT translation tasks show that our approach significantly improves over existing perturbation regularization methods. On the WMT16 En-De task, our model achieves a 1.80 SacreBLEU improvement over the vanilla Transformer.
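The idea of penalizing the prediction difference between the two passes can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact loss, so the sketch assumes the prediction difference is the change in the ground-truth token's probability between the original and the input-perturbed pass, and that its absolute value is added to the usual negative log-likelihood. The function names and the weight `alpha` are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_difference(orig_logits, pert_logits, target):
    """Ground-truth token probability in the original pass minus the
    same probability in the input-perturbed pass (assumed definition)."""
    p_orig = softmax(orig_logits)[target]
    p_pert = softmax(pert_logits)[target]
    return p_orig - p_pert


def pd_regularized_loss(orig_logits_seq, pert_logits_seq, targets, alpha=1.0):
    """Mean token-level NLL on the original pass plus alpha times the mean
    absolute prediction difference. `alpha` is a hypothetical weight."""
    n = len(targets)
    nll = 0.0
    pd = 0.0
    for orig, pert, y in zip(orig_logits_seq, pert_logits_seq, targets):
        nll += -math.log(softmax(orig)[y])
        pd += abs(prediction_difference(orig, pert, y))
    return nll / n + alpha * pd / n
```

When the perturbed pass produces the same logits as the original pass, the penalty term is zero and the loss reduces to the plain negative log-likelihood; any change in the ground-truth token's probability under perturbation raises the loss in proportion to `alpha`.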
Pages: 7665-7675
Number of pages: 11