Attention-via-Attention Neural Machine Translation

Cited: 0
Authors
Zhao, Shenjian [1]
Zhang, Zhihua [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Peking Univ, Beijing Inst Big Data Res, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
None available
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Since many languages originated from a common ancestral language and influence one another, similarities such as lexical similarity and named-entity similarity inevitably exist between them. In this paper, we leverage these similarities to improve translation performance in neural machine translation. Specifically, we introduce an attention-via-attention mechanism that allows information from source-side characters to flow directly to the target side. With this mechanism, target-side characters are generated based on the representations of source-side characters when the words are similar. For instance, our proposed neural machine translation system learns to transfer the character-level information of the English word 'system' through the attention-via-attention mechanism to generate the Czech word 'systém'. Consequently, our approach not only achieves competitive translation performance but also reduces the model size significantly.
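The abstract's mechanism can be illustrated with a minimal numerical sketch: word-level attention weights over the source are redistributed onto character-level attention within each source word, so character information of highly attended words can flow directly to the target side. All dimensions, tensor names, and the dot-product scoring function below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy sizes (hypothetical, for illustration only)
d = 4            # hidden size
n_words = 3      # number of source words
n_chars = 5      # characters per source word

rng = np.random.default_rng(0)
word_states = rng.normal(size=(n_words, d))           # source word encodings
char_states = rng.normal(size=(n_words, n_chars, d))  # per-word character encodings
decoder_state = rng.normal(size=(d,))                 # current target decoder state

# Word-level attention: how relevant is each source word to this step?
word_attn = softmax(word_states @ decoder_state)      # shape (n_words,)

# Character-level attention within each word, then reweighted by the
# word-level attention ("attention via attention"): characters of
# highly attended source words dominate the resulting distribution.
char_attn = softmax(char_states @ decoder_state, axis=-1)  # (n_words, n_chars)
joint_attn = word_attn[:, None] * char_attn                # (n_words, n_chars)

# Character-level context vector used when generating target characters
char_context = (joint_attn[..., None] * char_states).sum(axis=(0, 1))
```

Because each per-word character distribution sums to 1 and the word-level weights sum to 1, the joint distribution over all source characters is itself a valid probability distribution.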
Pages: 563-570
Page count: 8