Neural Machine Translation With GRU-Gated Attention Model

Cited by: 81
Authors
Zhang, Biao [1 ]
Xiong, Deyi [2 ]
Xie, Jun [3 ]
Su, Jinsong [1 ]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen 361005, Peoples R China
[2] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
[3] Tencent Co, Beijing 100080, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Decoding; Trade agreements; Logic gates; Adaptation models; NIST; Task analysis; Gated recurrent unit (GRU); gated attention model (GAtt); neural machine translation (NMT);
DOI
10.1109/TNNLS.2019.2957276
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Subject Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural machine translation (NMT) relies heavily on context vectors generated by an attention network to predict target words. In practice, we observe that the context vectors for different target words are quite similar to one another, and translations produced from such nondiscriminatory context vectors tend to be degenerate. We ascribe this similarity to the invariant source representations, which lack dynamics across decoding steps. In this article, we propose a novel gated recurrent unit (GRU)-gated attention model (GAtt) for NMT. By updating the source representations with the previous decoder state via a GRU, GAtt produces translation-sensitive source representations that in turn yield more discriminative context vectors. We further propose a variant of GAtt that swaps the input order of the source representations and the previous decoder state to the GRU. Experiments on the NIST Chinese-English, WMT14 English-German, and WMT17 English-German translation tasks show that the two GAtt models achieve significant improvements over vanilla attention-based NMT. Further analyses of the attention weights and context vectors demonstrate the effectiveness of GAtt in enhancing the discriminating capacity of the representations and in handling the challenging issue of overtranslation.
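The abstract describes the core mechanism: at each decoding step, every source annotation is first refreshed with the previous decoder state through a GRU, and attention is then computed over the refreshed annotations, so the resulting context vectors become step-specific. Below is a minimal sketch of one such step, assuming a standard additive (Bahdanau-style) attention scorer and PyTorch; the class name, module layout, and tensor shapes are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a GRU-gated attention step (assumptions, not the paper's code).
import torch
import torch.nn as nn

class GRUGatedAttention(nn.Module):
    def __init__(self, src_dim, dec_dim, attn_dim):
        super().__init__()
        # GRU cell that refreshes each source annotation with the previous
        # decoder state, making the annotations decoding-step dependent.
        self.gate = nn.GRUCell(input_size=dec_dim, hidden_size=src_dim)
        # Standard additive (Bahdanau-style) attention scorer.
        self.w_src = nn.Linear(src_dim, attn_dim, bias=False)
        self.w_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, annotations, prev_state):
        # annotations: (batch, src_len, src_dim) encoder outputs
        # prev_state:  (batch, dec_dim) previous decoder hidden state
        b, n, d = annotations.size()
        # Update every source annotation with the previous decoder state.
        # (The paper's variant swaps which vector serves as GRU input vs. hidden state.)
        upd = self.gate(
            prev_state.unsqueeze(1).expand(b, n, -1).reshape(b * n, -1),
            annotations.reshape(b * n, d),
        ).view(b, n, d)
        # Additive attention over the updated, step-specific annotations.
        scores = self.v(torch.tanh(self.w_src(upd) + self.w_dec(prev_state).unsqueeze(1)))
        alpha = torch.softmax(scores, dim=1)   # (batch, src_len, 1)
        context = (alpha * upd).sum(dim=1)     # (batch, src_dim)
        return context, alpha.squeeze(-1)
```

Because the annotations are regated at every step, two different target positions attend over two different sets of source vectors, which is what makes the resulting context vectors more discriminative than in vanilla attention.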
Pages: 4688-4698
Number of pages: 11