Training with Adversaries to Improve Faithfulness of Attention in Neural Machine Translation

Cited by: 0
Authors
Moradi, Pooya [1 ]
Kambhatla, Nishant [1 ]
Sarkar, Anoop [1 ]
Affiliations
[1] Simon Fraser Univ, 8888 Univ Dr, Burnaby, BC, Canada
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Can we trust that the attention heatmaps produced by a neural machine translation (NMT) model reflect its true internal reasoning? We isolate and examine in detail the notion of faithfulness in NMT models. We propose a measure of faithfulness for NMT based on a variety of stress tests in which model parameters are perturbed, and faithfulness is measured by how often the model output changes. We show that this faithfulness measure can be improved using a novel differentiable objective that rewards faithful behaviour by the model through probability divergence. Experimental results on multiple language pairs show that our objective function is effective in increasing faithfulness, supports a useful analysis of NMT model behaviour, and yields more trustworthy attention heatmaps. The proposed objective improves faithfulness without reducing translation quality; it also appears to have a useful regularization effect on the NMT model and can even improve translation quality in some cases.
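The core stress-test idea described in the abstract — perturb model parameters and measure how often the model's output changes — can be illustrated with a toy, pure-Python sketch. This is a hedged illustration only: the linear "model", the Gaussian noise scale, and all function names below are assumptions for exposition, not the paper's actual NMT setup or objective.

```python
import random

def predict(weights, x):
    # Toy "model": pick the class whose weight vector has the largest
    # dot product with the input. Stands in for a full NMT decoder.
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)

def perturbation_stress_test(weights, inputs, noise=0.5, trials=100, seed=0):
    """Return the fraction of (trial, input) pairs where adding Gaussian
    noise to the parameters flips the prediction. In the paper's setting
    the perturbations target attended components; here we just measure
    raw output sensitivity as an illustration."""
    rng = random.Random(seed)
    base = [predict(weights, x) for x in inputs]  # unperturbed outputs
    flips, total = 0, 0
    for _ in range(trials):
        # Perturb every parameter with zero-mean Gaussian noise.
        noisy = [[w + rng.gauss(0.0, noise) for w in row] for row in weights]
        for x, y in zip(inputs, base):
            flips += (predict(noisy, x) != y)
            total += 1
    return flips / total
```

With `noise=0.0` the flip rate is exactly zero; as the noise scale grows, the flip rate rises toward chance, giving a simple sensitivity curve of the kind a stress test can summarize.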
Pages: 86-93
Page count: 8