Contextual Parameter Generation for Universal Neural Machine Translation

Authors
Platanios, Emmanouil Antonios [1]
Sachan, Mrinmaya [1]
Neubig, Graham [2]
Mitchell, Tom M. [1]
Affiliations
[1] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language-specific parameterization, and that can also be used for domain adaptation. Our approach requires no changes to the model architecture of a standard NMT system, but instead introduces a new component, the contextual parameter generator (CPG), that generates the parameters of the system (e.g., weights in a neural network). This parameter generator accepts source and target language embeddings as input, and generates the parameters for the encoder and the decoder, respectively. The rest of the model remains unchanged and is shared across all languages. We show how this simple modification enables the system to use monolingual data for training and also perform zero-shot translation. We further show it is able to surpass state-of-the-art performance for both the IWSLT-15 and IWSLT-17 datasets and that the learned language embeddings are able to uncover interesting relationships between languages.
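The parameter-generation idea in the abstract can be illustrated with a toy sketch. It assumes the generator is a plain linear map from a language embedding to a flat parameter vector, which is only one possible form. The names `EMBED_DIM`, `NUM_PARAMS`, `W_enc`, `W_dec`, and `parameters_for_pair` are hypothetical, as are the sizes and language codes; none of them come from the record above.

```python
import random

random.seed(0)

EMBED_DIM = 4    # assumed size of each language embedding
NUM_PARAMS = 6   # assumed number of parameters in a toy encoder or decoder


def init_matrix(rows, cols):
    """Return a rows x cols matrix of small random values."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


# One embedding per language (in a real system these would be learned).
language_embeddings = {
    lang: [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
    for lang in ("en", "de", "vi")
}

# Generator weights, shared across all languages (assumed linear generator).
W_enc = init_matrix(NUM_PARAMS, EMBED_DIM)
W_dec = init_matrix(NUM_PARAMS, EMBED_DIM)


def generate_parameters(W, embedding):
    """Matrix-vector product: map a language embedding to a parameter vector."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in W]


def parameters_for_pair(source, target):
    """Encoder parameters depend on the source language, decoder on the target."""
    enc = generate_parameters(W_enc, language_embeddings[source])
    dec = generate_parameters(W_dec, language_embeddings[target])
    return enc, dec


enc_params, dec_params = parameters_for_pair("en", "de")
```

In this sketch, only the embeddings differ between languages while the generator weights are shared, so any source-target pair, including one never paired during training, yields a full set of encoder and decoder parameters.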
Pages: 425-435
Number of pages: 11