Neural Machine Translation with Target-Attention Model

Cited by: 6
Authors
Yang, Mingming [1 ]
Zhang, Min [1 ,2 ]
Chen, Kehai [3 ]
Wang, Rui [3 ]
Zhao, Tiejun [1 ]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
[3] Natl Inst Informat & Commun Technol, Kyoto 6190289, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
attention mechanism; neural machine translation; forward target-attention model; reverse target-attention model; bidirectional target-attention model;
DOI
10.1587/transinf.2019EDP7157
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
The attention mechanism, which selectively focuses on source-side information to learn a context vector for generating target words, has been shown to be an effective method for neural machine translation (NMT). In fact, generating target words depends not only on source-side information but also on target-side information. Although vanilla NMT can acquire target-side information implicitly via recurrent neural networks (RNNs), RNNs cannot adequately capture the global relationships between target-side words. To solve this problem, this paper proposes a novel target-attention approach to capture this information, thus enhancing target word predictions in NMT. Specifically, we propose three variants of the target-attention model to directly obtain the global relationships among target words: 1) a forward target-attention model that uses a target attention mechanism to incorporate previously generated target words into the prediction of the current target word; 2) a reverse target-attention model that adopts a reverse RNN to encode information from the entire reversed target sequence, which is then combined with the source context to generate the target sequence; 3) a bidirectional target-attention model that combines the forward and reverse target-attention models, making full use of target words to further improve the performance of NMT. Our methods can be integrated into both RNN-based and self-attention-based NMT, helping the model exploit global target-side information to improve translation quality. Experiments on the NIST Chinese-to-English and the WMT English-to-German translation tasks show that the proposed models achieve significant improvements over state-of-the-art baselines.
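The following is a minimal PyTorch sketch of the forward target-attention idea described in the abstract: at each decoding step, the model attends over the hidden states of previously generated target words and fuses that target-side context with the usual source-side context before predicting the next word. All class, method, and variable names here are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of forward target-attention (illustrative, not the paper's code).
import torch
import torch.nn as nn


class ForwardTargetAttention(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        # Additive (Bahdanau-style) scoring over past target hidden states.
        self.query_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.key_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.score = nn.Linear(hidden_size, 1, bias=False)
        # Fuse decoder state, source-side context, and target-side context.
        self.fuse = nn.Linear(3 * hidden_size, hidden_size)

    def forward(self, dec_state, past_states, src_context):
        # dec_state:   (batch, hidden)      current decoder state s_t
        # past_states: (batch, t, hidden)   states of already generated target words
        # src_context: (batch, hidden)      standard source-side context c_t
        q = self.query_proj(dec_state).unsqueeze(1)      # (batch, 1, hidden)
        k = self.key_proj(past_states)                   # (batch, t, hidden)
        e = self.score(torch.tanh(q + k)).squeeze(-1)    # (batch, t) attention scores
        alpha = torch.softmax(e, dim=-1)                 # target-attention weights
        tgt_context = torch.bmm(alpha.unsqueeze(1), past_states).squeeze(1)
        # Combined representation used to predict the next target word.
        combined = torch.cat([dec_state, src_context, tgt_context], dim=-1)
        return torch.tanh(self.fuse(combined))


if __name__ == "__main__":
    layer = ForwardTargetAttention(hidden_size=8)
    out = layer(torch.randn(2, 8), torch.randn(2, 5, 8), torch.randn(2, 8))
    print(out.shape)  # torch.Size([2, 8])
```

Under this reading, the reverse variant would replace the attended past states with the hidden states of a reverse RNN over the target sequence, and the bidirectional variant would concatenate both target-side contexts before the fusion layer.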
Pages: 684-694
Page count: 11
Related Papers
50 items in total
  • [21] Attention based English to Punjabi neural machine translation
    Singh, Shivkaran
    Kumar, M. Anand
    Soman, K. P.
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 34 (03) : 1551 - 1559
  • [22] Measuring and Improving Faithfulness of Attention in Neural Machine Translation
    Moradi, Pooya
    Kambhatla, Nishant
    Sarkar, Anoop
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 2791 - 2802
  • [23] Simultaneous neural machine translation with a reinforced attention mechanism
    Lee, YoHan
    Shin, JongHun
    Kim, YoungKil
    [J]. ETRI JOURNAL, 2021, 43 (05) : 775 - 786
  • [24] Synchronous Syntactic Attention for Transformer Neural Machine Translation
    Deguchi, Hiroyuki
    Tamura, Akihiro
    Ninomiya, Takashi
    [J]. ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2021, : 348 - 355
  • [25] Modeling Concentrated Cross-Attention for Neural Machine Translation with Gaussian Mixture Model
    Zhang, Shaolei
    Feng, Yang
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1401 - 1411
  • [26] Predicting and Using Target Length in Neural Machine Translation
    Yang, Zijian
    Gao, Yingbo
    Wang, Weiyue
    Ney, Hermann
    [J]. 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (AACL-IJCNLP 2020), 2020, : 389 - 395
  • [27] Bag-of-Words as Target for Neural Machine Translation
    Ma, Shuming
    Sun, Xu
    Wang, Yizhong
    Lin, Junyang
    [J]. PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 2, 2018, : 332 - 338
  • [28] Attention over Heads: A Multi-Hop Attention for Neural Machine Translation
    Iida, Shohei
    Kimura, Ryuichiro
    Cui, Hongyi
    Hung, Po-Hsuan
    Utsuro, Takehito
    Nagata, Masaaki
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019:): STUDENT RESEARCH WORKSHOP, 2019, : 217 - 222
  • [29] Towards Understanding Neural Machine Translation with Attention Heads' Importance
    Zhou, Zijie
    Zhu, Junguo
    Li, Weijiang
    [J]. APPLIED SCIENCES-BASEL, 2024, 14 (07):
  • [30] Training with Adversaries to Improve Faithfulness of Attention in Neural Machine Translation
    Moradi, Pooya
    Kambhatla, Nishant
    Sarkar, Anoop
    [J]. AACL-IJCNLP 2020: THE 1ST CONFERENCE OF THE ASIA-PACIFIC CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 10TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING: PROCEEDINGS OF THE STUDENT RESEARCH WORKSHOP, 2020, : 86 - 93