Cross-lingual Supervision Improves Unsupervised Neural Machine Translation

Cited by: 0
Authors
Wang, Mingxuan [1]
Bai, Hongxiao [2]
Zhao, Hai [2]
Li, Lei [1]
Affiliations
[1] ByteDance AI Lab, Beijing, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
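Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract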
We propose to improve unsupervised neural machine translation with cross-lingual supervision (CUNMT), which uses supervision signals from high-resource language pairs to improve the translation of zero-resource languages. Specifically, to train an En-Ro system without a parallel corpus, we can leverage the En-Fr and En-De corpora to jointly train translation from one language into many languages under one model. Simple and effective, CUNMT significantly improves translation quality by a large margin on the benchmark unsupervised translation tasks, and even achieves performance comparable to supervised NMT. In particular, on the WMT'14 En-Fr tasks CUNMT achieves 37.6 and 35.18 BLEU, which is very close to the large-scale supervised setting, and on the WMT'16 En-Ro tasks CUNMT achieves 35.09 BLEU, which is even better than the supervised Transformer baseline.
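As a rough illustration of training "from one language into many languages under one model", the sketch below pools toy En-Fr and En-De sentence pairs into a single training set. It assumes the target language is marked by a tag prepended to the source sentence, a common multilingual NMT convention; the abstract does not say that CUNMT does this. The tag format `<2xx>`, the helper names, and the toy sentences are all hypothetical.

```python
import random

def tag_pairs(pairs, tgt_lang):
    """Prefix each source sentence with a target-language tag such as <2fr>.

    The tag format is an assumption made for illustration only.
    """
    return [(f"<2{tgt_lang}> {src}", tgt) for src, tgt in pairs]

def build_mixed_corpus(parallel_by_lang, seed=0):
    """Pool tagged sentence pairs from several language pairs and shuffle them."""
    mixed = []
    for tgt_lang, pairs in parallel_by_lang.items():
        mixed.extend(tag_pairs(pairs, tgt_lang))
    # A fixed seed keeps the shuffle reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed

# Toy supervised data for the high-resource pairs (hypothetical examples).
parallel = {
    "fr": [("Hello .", "Bonjour ."), ("Thank you .", "Merci .")],
    "de": [("Hello .", "Hallo .")],
}

corpus = build_mixed_corpus(parallel)

# For the pair without parallel data (En-Ro), only the tag changes at
# inference time; no En-Ro sentence pairs appear in `corpus`.
ro_request = "<2ro> Hello ."
```

Under this assumption, a single model sees every language pair through the same input format, so supervision from En-Fr and En-De can be shared with the En-Ro direction, which would otherwise be trained on monolingual data alone.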
Pages: 89-96
Page count: 8