A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation

Cited by: 1
Authors
Vázquez, Raúl [1]
Raganato, Alessandro [1]
Creutz, Mathias [1]
Tiedemann, Jörg [1]
Affiliations
[1] University of Helsinki, Department of Digital Humanities, Helsinki, Finland
Funding
Academy of Finland; European Research Council
Keywords
Computational linguistics; Semantics; Benchmarking; Classification (of information); Computer aided language translation
DOI
10.1162/coli_a_00377
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Neural machine translation has considerably improved the quality of automatic translations by learning good representations of input sentences. In this article, we explore a multilingual translation model capable of producing fixed-size sentence representations by incorporating an intermediate crosslingual shared layer, which we refer to as attention bridge. This layer exploits the semantics from each language and develops into a language-agnostic meaning representation that can be efficiently used for transfer learning. We systematically study the impact of the size of the attention bridge and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that there is no conflict between translation performance and the use of sentence representations in downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. Nevertheless, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. Similarly, we show that trainable downstream tasks benefit from multilingual models, whereas additional language signals do not improve performance in non-trainable benchmarks. This is an important insight that helps to properly design models for specific applications. Finally, we also include an in-depth analysis of the proposed attention bridge and its ability to encode linguistic properties. We carefully analyze the information that is captured by individual attention heads and identify interesting patterns that explain the performance of specific settings in linguistic probing tasks.
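To make the architecture described in the abstract concrete, the sketch below shows one plausible PyTorch implementation of an inner-attention bridge, assuming the structured self-attention formulation of Lin et al. (2017): k attention heads pool a variable-length sequence of encoder states into k fixed-size vectors shared across all language pairs. All names and hyperparameters (d_model, d_hidden, k_heads) are illustrative assumptions, not taken from the authors' code.

```python
import torch
import torch.nn as nn

class AttentionBridge(nn.Module):
    """Minimal sketch of an inner-attention bridge: k attention heads pool
    a variable-length sequence of encoder states into k fixed-size vectors
    that a shared decoder attention can consume for any language pair."""

    def __init__(self, d_model: int, d_hidden: int, k_heads: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_hidden, bias=False)
        self.w2 = nn.Linear(d_hidden, k_heads, bias=False)

    def forward(self, enc_states: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, seq_len, d_model); pad_mask: (batch, seq_len), True at padding
        scores = self.w2(torch.tanh(self.w1(enc_states)))   # (batch, seq_len, k_heads)
        scores = scores.masked_fill(pad_mask.unsqueeze(-1), float("-inf"))
        attn = torch.softmax(scores, dim=1)                 # each head attends over positions
        return attn.transpose(1, 2) @ enc_states            # (batch, k_heads, d_model)

# Toy usage: ten heads over 512-dim states give a fixed 10 x 512 representation
bridge = AttentionBridge(d_model=512, d_hidden=1024, k_heads=10)
states = torch.randn(2, 7, 512)                             # fake encoder output
pad = torch.zeros(2, 7, dtype=torch.bool)                   # no padding in this toy batch
sent_repr = bridge(states, pad)                             # (2, 10, 512)
```

In this reading, varying k_heads corresponds to varying the size of the attention bridge studied in the article, and the k pooled vectors can be averaged or concatenated into a single fixed-size sentence embedding for the downstream classification and similarity tasks.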
Pages: 387-424
Page count: 38
Related papers
50 records in total
  • [1] An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation
    Raganato, Alessandro
Vázquez, Raúl
    Creutz, Mathias
Tiedemann, Jörg
4TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2019), 2019: 27-32
  • [2] Multilingual Agreement for Multilingual Neural Machine Translation
    Yang, Jian
    Yin, Yuwei
    Ma, Shuming
    Huang, Haoyang
    Zhang, Dongdong
    Li, Zhoujun
    Wei, Furu
ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 2, 2021: 233-239
  • [3] Parameter Differentiation Based Multilingual Neural Machine Translation
    Wang, Qian
    Zhang, Jiajun
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 11440-11448
  • [4] Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation
    Zhang, Pei
    Zhang, Xu
    Chen, Wei
    Yu, Jian
    Wang, Yanfeng
    Xiong, Deyi
ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325: 2298-2305
  • [5] A comparative study of machine translation for multilingual sentence-level sentiment analysis
    Araujo, Matheus
    Pereira, Adriano
    Benevenuto, Fabricio
INFORMATION SCIENCES, 2020, 512: 1078-1102
  • [6] A Survey of Multilingual Neural Machine Translation
    Dabre, Raj
    Chu, Chenhui
    Kunchukuttan, Anoop
    ACM COMPUTING SURVEYS, 2020, 53 (05)
  • [7] Massively Multilingual Neural Machine Translation
    Aharoni, Roee
    Johnson, Melvin
    Firat, Orhan
2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019: 3874-3884
  • [8] Multilingual Simultaneous Neural Machine Translation
    Arthur, Philip
    Ryu, Dongwon K.
    Haffari, Gholamreza
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021: 4758-4766
  • [9] Bilingual attention based neural machine translation
    Kang, Liyan
    He, Shaojie
    Wang, Mingxuan
    Long, Fei
    Su, Jinsong
APPLIED INTELLIGENCE, 2023, 53 (04): 4302-4315