Neural Machine Translation With GRU-Gated Attention Model

Cited by: 81
Authors
Zhang, Biao [1 ]
Xiong, Deyi [2 ]
Xie, Jun [3 ]
Su, Jinsong [1 ]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen 361005, Peoples R China
[2] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
[3] Tencent Co, Beijing 100080, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Decoding; Trade agreements; Logic gates; Adaptation models; NIST; Task analysis; Gated recurrent unit (GRU); gated attention model (GAtt); neural machine translation (NMT);
DOI
10.1109/TNNLS.2019.2957276
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural machine translation (NMT) relies heavily on context vectors generated by an attention network to predict target words. In practice, we observe that the context vectors for different target words are quite similar to one another, and translations with such nondiscriminatory context vectors tend to be degenerate. We ascribe this similarity to the invariant source representations, which lack dynamics across decoding steps. In this article, we propose a novel gated recurrent unit (GRU)-gated attention model (GAtt) for NMT. By updating the source representations with the previous decoder state via a GRU, GAtt produces translation-sensitive source representations that in turn yield discriminative context vectors. We further propose a variant of GAtt that swaps the order in which the source representations and the previous decoder state are fed to the GRU. Experiments on the NIST Chinese-English, WMT14 English-German, and WMT17 English-German translation tasks show that the two GAtt models achieve significant improvements over vanilla attention-based NMT. Further analyses of the attention weights and context vectors demonstrate the effectiveness of GAtt in enhancing the discriminating capacity of representations and in alleviating the challenging problem of overtranslation.
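To make the mechanism described in the abstract concrete, here is a minimal PyTorch sketch of GRU-gated attention. It assumes encoder annotations and decoder states share a single dimensionality and that attention scores use standard additive (Bahdanau-style) scoring; the class name GRUGatedAttention, the swap_inputs flag, and all shapes are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of GRU-gated attention (GAtt) as described in the abstract.
# Assumption: encoder annotations and decoder states share one size `dim`,
# so a single GRUCell supports both input orders.
import torch
import torch.nn as nn


class GRUGatedAttention(nn.Module):
    def __init__(self, dim: int, attn_dim: int, swap_inputs: bool = False):
        super().__init__()
        # GRU cell that refreshes every source annotation with the previous
        # decoder state before attention scores are computed.
        self.gate = nn.GRUCell(input_size=dim, hidden_size=dim)
        self.swap_inputs = swap_inputs  # the GAtt variant swaps the GRU inputs
        # Additive (Bahdanau-style) scoring -- an assumption, not specified here.
        self.w_src = nn.Linear(dim, attn_dim, bias=False)
        self.w_dec = nn.Linear(dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, src: torch.Tensor, prev_state: torch.Tensor):
        # src:        (batch, src_len, dim) encoder annotations
        # prev_state: (batch, dim)          previous decoder hidden state
        batch, src_len, dim = src.shape
        flat_src = src.reshape(batch * src_len, dim)
        flat_dec = (prev_state.unsqueeze(1)
                    .expand(-1, src_len, -1)
                    .reshape(batch * src_len, dim))
        if self.swap_inputs:
            # Variant: source annotation as GRU input, decoder state as hidden.
            updated = self.gate(flat_src, flat_dec)
        else:
            # Default: decoder state as GRU input, source annotation as hidden,
            # so the source representations change at every decoding step.
            updated = self.gate(flat_dec, flat_src)
        updated = updated.reshape(batch, src_len, dim)
        # Additive attention over the step-specific source representations.
        scores = self.v(torch.tanh(self.w_src(updated)
                                   + self.w_dec(prev_state).unsqueeze(1)))
        weights = torch.softmax(scores.squeeze(-1), dim=-1)  # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), updated).squeeze(1)
        return context, weights


if __name__ == "__main__":
    attn = GRUGatedAttention(dim=512, attn_dim=256, swap_inputs=False)
    src = torch.randn(8, 20, 512)   # 8 sentences, 20 source positions
    s_prev = torch.randn(8, 512)    # previous decoder state
    context, weights = attn(src, s_prev)
    print(context.shape, weights.shape)  # (8, 512) and (8, 20)
```

The swap_inputs flag corresponds to the GAtt variant from the abstract: it exchanges which tensor plays the GRU's input and which plays its hidden state, so the gated output is conditioned the other way around.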
Pages: 4688-4698
Number of pages: 11
Related Papers
50 records in total
  • [1] Neural Machine Translation with Target-Attention Model
    Yang, Mingming
    Zhang, Min
    Chen, Kehai
    Wang, Rui
    Zhao, Tiejun
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, E103D (03) : 684 - 694
  • [2] Encoding Gated Translation Memory into Neural Machine Translation
    Cao, Qian
    Xiong, Deyi
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3042 - 3047
  • [3] Look Harder: A Neural Machine Translation Model with Hard Attention
    Indurthi, Sathish
    Chung, Insoo
    Kim, Sangha
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3037 - 3043
  • [4] Implementing Neural Machine Translation with Bi-Directional GRU and Attention Mechanism on FPGAs Using HLS
    Li, Qin
    Zhang, Xiaofan
    Xiong, JinJun
    Hwu, Wen-mei
    Chen, Deming
    [J]. 24TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC 2019), 2019, : 693 - 698
  • [5] A Visual Attention Grounding Neural Model for Multimodal Machine Translation
    Zhou, Mingyang
    Cheng, Runxiang
    Lee, Yong Jae
    Yu, Zhou
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3643 - 3653
  • [6] Recurrent Attention for Neural Machine Translation
    Zeng, Jiali
    Wu, Shuangzhi
    Yin, Yongjing
    Jiang, Yufan
    Li, Mu
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3216 - 3225
  • [7] Neural Machine Translation with Deep Attention
    Zhang, Biao
    Xiong, Deyi
    Su, Jinsong
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (01) : 154 - 163
  • [8] Attention-via-Attention Neural Machine Translation
    Zhao, Shenjian
    Zhang, Zhihua
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI 2018), 2018, : 563 - 570
  • [9] Neural Machine Translation with CARU-Embedding Layer and CARU-Gated Attention Layer
    Im, Sio-Kei
    Chan, Ka-Hou
    [J]. MATHEMATICS, 2024, 12 (07)
  • [10] Improving neural machine translation using gated state network and focal adaptive attention network
    Huang, Li
    Chen, Wenyu
    Liu, Yuguo
    Zhang, He
    Qu, Hong
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (23): : 15955 - 15967