Sparse and Constrained Attention for Neural Machine Translation

Cited by: 0
Authors
Malaviya, Chaitanya [1 ,3 ]
Ferreira, Pedro [2 ]
Martins, Andre F. T. [3 ,4 ]
Institutions
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Univ Lisbon, Inst Super Tecn, Lisbon, Portugal
[3] Unbabel, Lisbon, Portugal
[4] Inst Telecomunicacoes, Lisbon, Portugal
Funding
European Research Council;
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Subject Classification Codes
081203; 0835;
Abstract
In NMT, source words are sometimes dropped from the translation or translated repeatedly. We explore novel strategies to address this coverage problem that change only the attention transformation. Our approach allocates fertilities to source words, which are used to bound the attention each word can receive. We experiment with various sparse and constrained attention transformations and propose a new one, constrained sparsemax, shown to be both differentiable and sparse. Empirical evaluation is provided on three language pairs.
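As a rough illustration of the constrained sparsemax transformation described in the abstract, the NumPy sketch below projects a vector of attention scores onto the probability simplex with per-word upper bounds (the fertilities). This is a minimal sketch, not the authors' implementation: the function name `constrained_sparsemax`, the bisection search for the threshold, and the toy inputs are illustrative assumptions.

```python
import numpy as np

def constrained_sparsemax(z, u, n_iter=50):
    """Euclidean projection of scores z onto {p : 0 <= p <= u, sum(p) = 1}.

    Sketch of a constrained sparsemax-style transformation: attention scores z
    are projected onto the simplex subject to per-word upper bounds u (the
    fertilities). The solution has the form p_i = clip(z_i - tau, 0, u_i);
    the threshold tau is found here by bisection (an illustrative choice).
    """
    z = np.asarray(z, dtype=float)
    u = np.asarray(u, dtype=float)
    assert u.sum() >= 1.0, "upper bounds must leave the problem feasible"

    lo = z.min() - u.max() - 1.0   # every coordinate hits its bound: mass >= 1
    hi = z.max()                   # every coordinate is clipped to 0: mass = 0
    for _ in range(n_iter):        # bisection on the threshold tau
        tau = 0.5 * (lo + hi)
        mass = np.clip(z - tau, 0.0, u).sum()
        if mass > 1.0:
            lo = tau
        else:
            hi = tau
    return np.clip(z - 0.5 * (lo + hi), 0.0, u)

# Toy example (hypothetical scores and fertility bounds).
print(constrained_sparsemax([1.2, 0.3, 2.0, -0.5], [1.0, 1.0, 0.3, 1.0]))
```

In the toy example the third source word has a fertility bound of 0.3, so its attention is capped there and the remaining mass shifts to the first word (output approximately [0.7, 0, 0.3, 0]); the other entries are exactly zero, illustrating both the sparsity and the coverage bound mentioned in the abstract.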
Pages: 370-376
Number of pages: 7
Related Papers
50 records in total
  • [31] Neural Machine Translation With GRU-Gated Attention Model
    Zhang, Biao
    Xiong, Deyi
    Xie, Jun
    Su, Jinsong
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4688 - 4698
  • [32] Machine Translation for Indian Languages Utilizing Recurrent Neural Networks and Attention
    Sharma, Sonali
    Diwakar, Manoj
    [J]. DISTRIBUTED COMPUTING AND OPTIMIZATION TECHNIQUES, ICDCOT 2021, 2022, 903 : 593 - 602
  • [33] Hybrid Attention for Chinese Character-Level Neural Machine Translation
    Wang, Feng
    Chen, Wei
    Yang, Zhen
    Xu, Shuang
    Xu, Bo
    [J]. NEUROCOMPUTING, 2019, 358 : 44 - 52
  • [34] Neural Machine Translation Models with Attention-Based Dropout Layer
    Israr, Huma
    Khan, Safdar Abbas
    Tahir, Muhammad Ali
    Shahzad, Muhammad Khuram
    Ahmad, Muneer
    Zain, Jasni Mohamad
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (02): : 2981 - 3009
  • [35] Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings
    Kuang, Shaohui
    Li, Junhui
    Branco, Antonio
    Luo, Weihua
    Xiong, Deyi
    [J]. PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 1767 - 1776
  • [36] Losing Heads in the Lottery: Pruning Transformer Attention in Neural Machine Translation
    Behnke, Maximiliana
    Heafield, Kenneth
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 2664 - 2674
  • [37] Multi-Granularity Self-Attention for Neural Machine Translation
    Hao, Jie
    Wang, Xing
    Shi, Shuming
    Zhang, Jinfeng
    Tu, Zhaopeng
    [J]. 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 887 - 897
  • [38] Neural Machine Translation with Attention Based on a New Syntactic Branch Distance
    Peng, Ru
    Chen, Zhitao
    Hao, Tianyong
    Fang, Yi
    [J]. MACHINE TRANSLATION, CCMT 2019, 2019, 1104 : 47 - 57
  • [39] An Effective Coverage Approach for Attention-based Neural Machine Translation
    Hoang-Quan Nguyen
    Thuan-Minh Nguyen
    Huy-Hien Vu
    Van-Vinh Nguyen
    Phuong-Thai Nguyen
    Thi-Nga-My Dao
    Kieu-Hue Tran
    Khac-Quy Dinh
    [J]. PROCEEDINGS OF 2019 6TH NATIONAL FOUNDATION FOR SCIENCE AND TECHNOLOGY DEVELOPMENT (NAFOSTED) CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS), 2019, : 240 - 245
  • [40] Attending From Foresight: A Novel Attention Mechanism for Neural Machine Translation
    Li, Xintong
    Liu, Lemao
    Tu, Zhaopeng
    Li, Guanlin
    Shi, Shuming
    Meng, Max Q. -H.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 2606 - 2616