Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation

Authors
Zhang, Biao [1]
Williams, Philip [1]
Titov, Ivan [1, 2]
Sennrich, Rico [1, 3]
Affiliations
[1] University of Edinburgh, School of Informatics, Edinburgh, Midlothian, Scotland
[2] University of Amsterdam, ILLC, Amsterdam, Netherlands
[3] University of Zurich, Department of Computational Linguistics, Zurich, Switzerland
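Funding
Swiss National Science Foundation; EU Horizon 2020
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405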
Abstract
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e., translating into a wrong target language) as the major source of the inferior zero-shot performance, and propose random online backtranslation to enforce the translation of unseen training language pairs. Experiments on OPUS-100 (a novel multilingual dataset with 100 languages) show that our approach substantially narrows the performance gap with bilingual models in both one-to-many and many-to-many settings, and improves zero-shot performance by approximately 10 BLEU, approaching conventional pivot-based methods.
Pages: 1628-1639
Page count: 12
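
The abstract names two ideas that a small example can make concrete: off-target translation (output in the wrong target language) and random online backtranslation. The following is a minimal sketch, not the authors' implementation. It assumes each training example is a (source, target, target-language tag) triple, that language identification of outputs has already been done elsewhere, and that the current model is exposed through a hypothetical `translate(sentence, lang)` callable. The function names `off_target_rate` and `random_online_backtranslation` are illustrative only.

```python
import random
from typing import Callable, List, Sequence, Tuple

# (source sentence, target sentence, target language tag); an assumed layout.
Example = Tuple[str, str, str]


def off_target_rate(predicted_langs: Sequence[str], intended_lang: str) -> float:
    """Fraction of outputs whose detected language differs from the intended one."""
    if not predicted_langs:
        return 0.0
    wrong = sum(1 for lang in predicted_langs if lang != intended_lang)
    return wrong / len(predicted_langs)


def random_online_backtranslation(
    batch: Sequence[Example],
    languages: Sequence[str],
    translate: Callable[[str, str], str],  # hypothetical: current model's decoder
    rng: random.Random,
) -> List[Example]:
    """Build pseudo-parallel examples for language pairs unseen in training.

    For each example, sample a random language other than the target language,
    back-translate the target sentence into it with the current model, and pair
    the synthetic source with the original target.
    """
    augmented: List[Example] = []
    for _src, tgt, tgt_lang in batch:
        candidates = [lang for lang in languages if lang != tgt_lang]
        sampled_lang = rng.choice(candidates)
        pseudo_src = translate(tgt, sampled_lang)
        augmented.append((pseudo_src, tgt, tgt_lang))
    return augmented


if __name__ == "__main__":
    # Stub "model" that only tags the sentence, so the sketch runs without a real NMT system.
    def stub_translate(sentence: str, lang: str) -> str:
        return f"<{lang}> {sentence}"

    batch = [("Hello", "Hallo", "de"), ("Bonjour", "Hello", "en")]
    print(random_online_backtranslation(batch, ["en", "de", "fr"], stub_translate, random.Random(0)))
    print(off_target_rate(["de", "fr", "de", "en"], "de"))
```

In this sketch the synthetic examples would be mixed into the regular training batch, so the model sees source languages paired with targets that the parallel data never covers. How the sampled language is chosen and how the synthetic data is weighted are details left to the paper.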