A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation

Cited by: 1
Authors
Vázquez, Raúl [1]
Raganato, Alessandro [1]
Creutz, Mathias [1]
Tiedemann, Jörg [1]
Affiliations
[1] University of Helsinki, Department of Digital Humanities, Helsinki, Finland
Funding
Academy of Finland; European Research Council
Keywords
Computational linguistics; Semantics; Benchmarking; Classification (of information); Computer-aided language translation
DOI
10.1162/coli_a_00377
CLC classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural machine translation has considerably improved the quality of automatic translations by learning good representations of input sentences. In this article, we explore a multilingual translation model capable of producing fixed-size sentence representations by incorporating an intermediate cross-lingual shared layer, which we refer to as an attention bridge. This layer exploits the semantics of each language and develops into a language-agnostic meaning representation that can be efficiently used for transfer learning. We systematically study the impact of the size of the attention bridge and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that there is no conflict between translation performance and the use of sentence representations in downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also increase accuracy in trainable classification tasks, whereas shorter representations yield greater compression, which is beneficial in non-trainable similarity tasks. Similarly, we show that trainable downstream tasks benefit from multilingual models, whereas additional language signals do not improve performance in non-trainable benchmarks. This is an important insight that helps to properly design models for specific applications. Finally, we include an in-depth analysis of the proposed attention bridge and its ability to encode linguistic properties. We carefully analyze the information captured by individual attention heads and identify interesting patterns that explain the performance of specific settings in linguistic probing tasks.
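To make the mechanism concrete, the following is a minimal PyTorch sketch of an inner-attention bridge in the spirit of the abstract: a shared layer that attends over the encoder states and always returns a fixed number of attention heads, independent of sentence length. It follows the structured self-attention formulation of Lin et al. (2017), A = softmax(W2 tanh(W1 H^T)), M = A H; the module and hyperparameter names (hidden_dim, bridge_dim, num_heads) are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionBridge(nn.Module):
    """Inner-attention layer yielding a fixed-size sentence representation.

    Sketch of A = softmax(W2 @ tanh(W1 @ H^T)), M = A @ H, where H holds the
    encoder states and each of the num_heads rows of A is one attention head.
    All names here are illustrative, not the paper's actual code.
    """

    def __init__(self, hidden_dim: int, bridge_dim: int, num_heads: int):
        super().__init__()
        self.w1 = nn.Linear(hidden_dim, bridge_dim, bias=False)
        self.w2 = nn.Linear(bridge_dim, num_heads, bias=False)

    def forward(self, enc_states: torch.Tensor,
                pad_mask: torch.Tensor = None) -> torch.Tensor:
        # enc_states: (batch, seq_len, hidden_dim)
        # pad_mask:   (batch, seq_len), True at padding positions
        scores = self.w2(torch.tanh(self.w1(enc_states)))  # (batch, seq_len, num_heads)
        scores = scores.transpose(1, 2)                    # (batch, num_heads, seq_len)
        if pad_mask is not None:
            scores = scores.masked_fill(pad_mask.unsqueeze(1), float("-inf"))
        attn = F.softmax(scores, dim=-1)                   # one distribution per head
        # Output is (batch, num_heads, hidden_dim): its size depends only on
        # num_heads and hidden_dim, never on the input sentence length.
        return torch.bmm(attn, enc_states)


# Example: two 25-token sentences compressed to 10 heads of size 512 each.
bridge = AttentionBridge(hidden_dim=512, bridge_dim=1024, num_heads=10)
sentence_repr = bridge(torch.randn(2, 25, 512))  # -> shape (2, 10, 512)
```

Because the output shape is fixed, the same bridge can sit between any encoder-decoder pair in a multilingual setup, and its flattened output doubles as a fixed-size sentence embedding for the downstream classification and similarity tasks discussed above.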
Pages: 387-424
Number of pages: 38