Training with Adversaries to Improve Faithfulness of Attention in Neural Machine Translation

Cited by: 0
Authors
Moradi, Pooya [1 ]
Kambhatla, Nishant [1 ]
Sarkar, Anoop [1 ]
Affiliations
[1] Simon Fraser Univ, 8888 Univ Dr, Burnaby, BC, Canada
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Can we trust that the attention heatmaps produced by a neural machine translation (NMT) model reflect its true internal reasoning? We isolate and examine in detail the notion of faithfulness in NMT models. We propose a measure of faithfulness for NMT based on a variety of stress tests in which model parameters are perturbed, and faithfulness is measured by how often the model output changes. We show that this faithfulness measure can be improved using a novel differentiable objective that rewards faithful behaviour by the model through probability divergence. Experimental results on multiple language pairs show that our objective function is effective in increasing faithfulness, supports a useful analysis of NMT model behaviour, and yields more trustworthy attention heatmaps. The proposed objective improves faithfulness without reducing translation quality; it also appears to have a useful regularization effect on the NMT model and can even improve translation quality in some cases.
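The core stress-test idea described in the abstract — perturb model parameters and measure how often the model's output changes — can be illustrated with a toy, pure-Python sketch. This is a hedged illustration only: the linear "model", the Gaussian noise scale, and all function names below are assumptions for exposition, not the paper's actual NMT setup or objective.

```python
import random

def predict(weights, x):
    # Toy "model": pick the class whose weight vector has the largest
    # dot product with the input. Stands in for a full NMT decoder.
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)

def perturbation_stress_test(weights, inputs, noise=0.5, trials=100, seed=0):
    """Return the fraction of (trial, input) pairs where adding Gaussian
    noise to the parameters flips the prediction. In the paper's setting
    the perturbations target attended components; here we just measure
    raw output sensitivity as an illustration."""
    rng = random.Random(seed)
    base = [predict(weights, x) for x in inputs]  # unperturbed outputs
    flips, total = 0, 0
    for _ in range(trials):
        # Perturb every parameter with zero-mean Gaussian noise.
        noisy = [[w + rng.gauss(0.0, noise) for w in row] for row in weights]
        for x, y in zip(inputs, base):
            flips += (predict(noisy, x) != y)
            total += 1
    return flips / total
```

With `noise=0.0` the flip rate is exactly zero; as the noise scale grows, the flip rate rises toward chance, giving a simple sensitivity curve of the kind a stress test can summarize.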
Pages: 86-93
Page count: 8