Improving neural machine translation using gated state network and focal adaptive attention network

Cited by: 0
Authors
Huang, Li [1 ]
Chen, Wenyu [1 ]
Liu, Yuguo [1 ]
Zhang, He [2 ]
Qu, Hong [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
[2] Tencent Big Data Prod Ctr CSIG, Chengdu 610094, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2021, Vol. 33, Issue 23
Funding
US National Science Foundation;
Keywords
Attention mechanism; Neural machine translation; Deep learning;
DOI
10.1007/s00521-021-06444-2
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The currently predominant token-to-token attention mechanism has demonstrated its ability to capture word dependencies in neural machine translation. However, this mechanism treats a sequence as a bag of tokens and computes the similarity between tokens without considering their intrinsic interactions. In this paper, we argue that this attention mechanism may miss the opportunity to take advantage of state information accumulated over multiple time steps. We therefore propose a Gated State Network, which manipulates the state information flow with sequential characteristics. We also incorporate a Focal Adaptive Attention Network, which uses a Gaussian distribution to concentrate the attention distribution around a predicted focal position and its neighborhood. Experimental results on the WMT'14 English-German and WMT'17 Chinese-English translation tasks demonstrate the effectiveness of the proposed approach.
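The Gaussian focusing described in the abstract resembles Gaussian-biased local attention. A minimal NumPy sketch under that assumption (function and variable names are illustrative, not the authors' implementation):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def focal_attention(scores, focal_pos, sigma):
    """Concentrate attention around a predicted focal position (sketch).

    scores:    (tgt_len, src_len) raw token-to-token attention logits
    focal_pos: (tgt_len,) predicted focal source position per target step
    sigma:     standard deviation controlling the width of the focus
    """
    src_len = scores.shape[-1]
    positions = np.arange(src_len)
    # Gaussian log-prior peaked at each target step's focal position;
    # added to the logits, it down-weights tokens far from the focus.
    bias = -((positions[None, :] - focal_pos[:, None]) ** 2) / (2.0 * sigma**2)
    return softmax(scores + bias)

# Toy usage: 2 target steps attending over 5 source tokens with
# uniform raw scores, so the Gaussian prior alone sets the focus.
scores = np.zeros((2, 5))
focal = np.array([1.0, 3.0])
attn = focal_attention(scores, focal, sigma=1.0)
```

With uniform raw scores, each row of `attn` peaks at its focal position, illustrating how the Gaussian term concentrates the distribution on a focal token and its neighborhood.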
Pages: 15955-15967
Number of pages: 13
Related Papers
50 items in total
  • [1] Improving neural machine translation using gated state network and focal adaptive attention network
    Huang, Li
    Chen, Wenyu
    Liu, Yuguo
    Zhang, He
    Qu, Hong
    [J]. Neural Computing and Applications, 2021, 33 (23) : 15955 - 15967
  • [2] Improving neural machine translation using gated state network and focal adaptive attention network
    Li Huang
    Wenyu Chen
    Yuguo Liu
    He Zhang
    Hong Qu
    [J]. Neural Computing and Applications, 2021, 33 : 15955 - 15967
  • [3] Neural Machine Translation With GRU-Gated Attention Model
    Zhang, Biao
    Xiong, Deyi
    Xie, Jun
    Su, Jinsong
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4688 - 4698
  • [4] Measuring and Improving Faithfulness of Attention in Neural Machine Translation
    Moradi, Pooya
    Kambhatla, Nishant
    Sarkar, Anoop
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 2791 - 2802
  • [5] Improving Neural Machine Translation Using Rule-Based Machine Translation
    Singh, Muskaan
    Kumar, Ravinder
    Chana, Inderveer
    [J]. 2019 7TH INTERNATIONAL CONFERENCE ON SMART COMPUTING & COMMUNICATIONS (ICSCC), 2019, : 8 - 12
  • [6] English-Afaan Oromo Machine Translation Using Deep Attention Neural Network
    Gemechu, EbisaA
    Kanagachidambaresan, G. R.
    [J]. OPTICAL MEMORY AND NEURAL NETWORKS, 2023, 32 (03) : 159 - 168
  • [7] English-Afaan Oromo Machine Translation Using Deep Attention Neural Network
    G. R. Ebisa A. Gemechu
    [J]. Optical Memory and Neural Networks, 2023, 32 : 159 - 168
  • [8] Encoding Gated Translation Memory into Neural Machine Translation
    Cao, Qian
    Xiong, Deyi
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3042 - 3047
  • [9] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3216 - 3225
  • [10] Neural Machine Translation with Deep Attention
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (01) : 154 - 163