Look Harder: A Neural Machine Translation Model with Hard Attention

Cited by: 0
Authors
Indurthi, Sathish [1 ]
Chung, Insoo [1 ]
Kim, Sangha [1 ]
Affiliations
[1] Samsung Res, Seoul, South Korea
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Soft-attention based Neural Machine Translation (NMT) models have achieved promising results on several translation tasks. These models attend to all the words in the source sequence for each target token, which makes them ineffective for long-sequence translation. In this work, we propose a hard-attention based NMT model that selects a subset of source tokens for each target token to handle long sequences effectively. Because the hard-attention mechanism is discrete, we design a reinforcement learning algorithm coupled with a reward-shaping strategy to train it efficiently. Experimental results show that the proposed model performs better on long sequences and thereby achieves significant BLEU score improvements on English-German (EN-DE) and English-French (EN-FR) translation tasks compared to soft-attention based NMT models.
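The abstract's key idea is that each target token attends to only a subset of source tokens, with the discrete (non-differentiable) selection trained by reinforcement learning with reward shaping. The following is a minimal illustrative sketch of that selection step in NumPy; it is not the authors' implementation, and the function name hard_attention_context, the dot-product scoring, the top-k or sampled selection rule, and the uniform averaging of the chosen states are all assumptions made only to show the mechanism.

# Illustrative sketch of hard attention over source states (NumPy).
# NOT the paper's implementation: function name, top-k/sampled selection,
# and uniform averaging of the selected states are assumptions.
import numpy as np

def hard_attention_context(query, source_states, k=3, rng=None):
    """Select k source positions for one target step and return their context.

    query:          (d,)   decoder state for the current target token
    source_states:  (n, d) encoder states for the n source tokens
    k:              number of source tokens to attend to (a hyperparameter)
    rng:            optional np.random.Generator; if given, positions are
                    sampled from the score distribution (the stochastic choice
                    that makes RL-style training necessary), otherwise the
                    top-k scoring positions are taken deterministically.
    """
    scores = source_states @ query                   # (n,) dot-product scores
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                             # softmax over source positions
    if rng is not None:
        idx = rng.choice(len(probs), size=k, replace=False, p=probs)
    else:
        idx = np.argsort(-scores)[:k]                # hard selection: keep only k tokens
    context = source_states[idx].mean(axis=0)        # context built from the subset only
    return context, idx, probs

# Toy usage: 10 source tokens with 8-dimensional encoder states.
rng = np.random.default_rng(0)
src = rng.normal(size=(10, 8))
q = rng.normal(size=8)
ctx, chosen, p = hard_attention_context(q, src, k=3, rng=rng)
print("selected source positions:", chosen)

The sampled variant returns the selection probabilities so that a score-function (REINFORCE-style) gradient with a shaped reward could, in principle, be attached to the chosen indices; the paper's actual training procedure may differ in its details.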
Pages: 3037-3043
Page count: 7