Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation

Cited by: 0
Authors
Ai, Xi [1]
Fang, Bin [1]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Unsupervised neural machine translation (UNMT) tackles translation using only monolingual corpora in the two languages involved. Since there is no explicit cross-lingual signal, pre-training and synthetic sentence pairs are crucial to the success of UNMT. In this work, we empirically study the core training procedure of UNMT to analyze the synthetic sentence pairs obtained from back-translation. We introduce new losses to UNMT that regularize the synthetic sentence pairs by training the UNMT objective and the regularization objective jointly. Our comprehensive experiments show that our method generally improves the performance of currently successful models on three similar pairs, {French, German, Romanian} <-> English, and one dissimilar pair, Russian <-> English, at an acceptable additional cost.
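The joint training described in the abstract can be pictured with a minimal sketch. The abstract does not give the form of the regularization losses or how they are combined with the UNMT objective, so the weighted sum below, the `reg_weight` parameter, and the toy `translate` function are all illustrative assumptions, not the authors' method.

```python
from typing import Callable, List, Tuple


def back_translate(
    sentences: List[str], translate: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Build synthetic sentence pairs from monolingual data.

    Each real sentence is translated by the current model; the model output
    becomes the synthetic source and the real sentence is the target.
    `translate` is a hypothetical stand-in for the model.
    """
    return [(translate(s), s) for s in sentences]


def joint_loss(unmt_loss: float, reg_loss: float, reg_weight: float = 1.0) -> float:
    """Combine the UNMT loss and a regularization loss into one objective.

    A weighted sum is an assumption made for illustration; the abstract only
    states that the two objectives are trained jointly.
    """
    return unmt_loss + reg_weight * reg_loss


# Toy usage: upper-casing stands in for a translation model.
pairs = back_translate(["hello world"], lambda s: s.upper())
total = joint_loss(unmt_loss=2.0, reg_loss=0.5, reg_weight=0.5)
```

In a real system the two loss values would come from the same batch of synthetic pairs, so that a single backward pass updates the model on both objectives.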
Pages: 12471-12479
Page count: 9