Improving neural machine translation using gated state network and focal adaptive attention network

Cited by: 0
Authors
Huang, Li [1 ]
Chen, Wenyu [1 ]
Liu, Yuguo [1 ]
Zhang, He [2 ]
Qu, Hong [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
[2] Tencent Big Data Prod Ctr CSIG, Chengdu 610094, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2021, Vol. 33, Issue 23
Funding
US National Science Foundation;
Keywords
Attention mechanism; Neural machine translation; Deep learning;
DOI
10.1007/s00521-021-06444-2
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The currently predominant token-to-token attention mechanism has demonstrated its ability to capture word dependencies in neural machine translation. However, this mechanism treats a sequence as a bag of tokens and computes the similarity between tokens without considering their intrinsic interactions. In this paper, we argue that this attention mechanism may miss the opportunity to take advantage of state information accumulated over multiple time steps. We therefore propose a Gated State Network, which manipulates the state information flow with sequential characteristics. We also incorporate a Focal Adaptive Attention Network, which uses a Gaussian distribution to concentrate the attention distribution around a predicted focal position and its neighborhood. Experimental results on the WMT'14 English-German and WMT'17 Chinese-English translation tasks demonstrate the effectiveness of the proposed approach.
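The Gaussian focusing described in the abstract resembles Gaussian-biased local attention. A minimal NumPy sketch under that assumption (function and variable names are illustrative, not the authors' implementation):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def focal_attention(scores, focal_pos, sigma):
    """Concentrate attention around a predicted focal position (sketch).

    scores:    (tgt_len, src_len) raw token-to-token attention logits
    focal_pos: (tgt_len,) predicted focal source position per target step
    sigma:     standard deviation controlling the width of the focus
    """
    src_len = scores.shape[-1]
    positions = np.arange(src_len)
    # Gaussian log-prior peaked at each target step's focal position;
    # added to the logits, it down-weights tokens far from the focus.
    bias = -((positions[None, :] - focal_pos[:, None]) ** 2) / (2.0 * sigma**2)
    return softmax(scores + bias)

# Toy usage: 2 target steps attending over 5 source tokens with
# uniform raw scores, so the Gaussian prior alone sets the focus.
scores = np.zeros((2, 5))
focal = np.array([1.0, 3.0])
attn = focal_attention(scores, focal, sigma=1.0)
```

With uniform raw scores, each row of `attn` peaks at its focal position, illustrating how the Gaussian term concentrates the distribution on a focal token and its neighborhood.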
Pages: 15955-15967
Number of pages: 13
Related Papers
50 items in total
  • [1] Improving neural machine translation using gated state network and focal adaptive attention network
    Huang, Li
    Chen, Wenyu
    Liu, Yuguo
    Zhang, He
    Qu, Hong
    [J]. Neural Computing and Applications, 2021, 33 (23) : 15955 - 15967
  • [2] Improving neural machine translation using gated state network and focal adaptive attention network
    Li Huang
    Wenyu Chen
    Yuguo Liu
    He Zhang
    Hong Qu
    [J]. Neural Computing and Applications, 2021, 33 : 15955 - 15967
  • [3] Neural Machine Translation With GRU-Gated Attention Model
    Zhang, Biao
    Xiong, Deyi
    Xie, Jun
    Su, Jinsong
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4688 - 4698
  • [4] Measuring and Improving Faithfulness of Attention in Neural Machine Translation
    Moradi, Pooya
    Kambhatla, Nishant
    Sarkar, Anoop
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 2791 - 2802
  • [5] Improving Neural Machine Translation Using Rule-Based Machine Translation
    Singh, Muskaan
    Kumar, Ravinder
    Chana, Inderveer
    [J]. 2019 7TH INTERNATIONAL CONFERENCE ON SMART COMPUTING & COMMUNICATIONS (ICSCC), 2019, : 8 - 12
  • [6] English-Afaan Oromo Machine Translation Using Deep Attention Neural Network
    Gemechu, EbisaA
    Kanagachidambaresan, G. R.
    [J]. OPTICAL MEMORY AND NEURAL NETWORKS, 2023, 32 (03) : 159 - 168
  • [7] English-Afaan Oromo Machine Translation Using Deep Attention Neural Network
    G. R. Ebisa A. Gemechu
    [J]. Optical Memory and Neural Networks, 2023, 32 : 159 - 168
  • [8] Encoding Gated Translation Memory into Neural Machine Translation
    Cao, Qian
    Xiong, Deyi
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3042 - 3047
  • [9] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3216 - 3225
  • [10] Neural Machine Translation with Deep Attention
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (01) : 154 - 163