Empirical Regularization for Synthetic Sentence Pairs in Unsupervised Neural Machine Translation

Cited by: 0
Authors
Ai, Xi [1 ]
Fang, Bin [1 ]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unsupervised neural machine translation (UNMT) tackles translation using only monolingual corpora in the two required languages. Since there is no explicit cross-lingual signal, pre-training and synthetic sentence pairs are crucial to the success of UNMT. In this work, we empirically study the core training procedure of UNMT to analyze the synthetic sentence pairs obtained from back-translation. We introduce new losses to UNMT that regularize the synthetic sentence pairs by training the UNMT objective and the regularization objective jointly. Our comprehensive experiments show that our method generally improves the performance of currently successful models on three similar pairs, {French, German, Romanian} <-> English, and one dissimilar pair, Russian <-> English, at an acceptable additional cost.
Pages: 12471-12479
Number of pages: 9
Related papers
50 in total
  • [1] Unsupervised Neural Machine Translation for Similar and Distant Language Pairs: An Empirical Study
    Sun, Haipeng
    Wang, Rui
    Utiyama, Masao
    Marie, Benjamin
    Chen, Kehai
    Sumita, Eiichiro
    Zhao, Tiejun
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2021, 20 (01)
  • [2] Unsupervised Neural Machine Translation with SMT as Posterior Regularization
    Ren, Shuo
    Zhang, Zhirui
    Liu, Shujie
    Zhou, Ming
    Ma, Shuai
    [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 241 - 248
  • [3] Segmenting Long Sentence Pairs for Statistical Machine Translation
    Meng, Biping
    Huang, Shujian
    Dai, Xinyu
    Chen, Jiajun
    [J]. 2009 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, 2009, : 53 - 58
  • [4] Unsupervised dialectal neural machine translation
    Farhan, Wael
    Talafha, Bashar
    Abuammar, Analle
    Jaikat, Ruba
    Al-Ayyoub, Mahmoud
    Tarakji, Ahmad Bisher
    Toma, Anas
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2020, 57 (03)
  • [5] Improving Neural Machine Translation with Neural Sentence Rewriting
    Wu, Tian
    He, Zhongjun
    Chen, Enhong
    Wang, Haifeng
    [J]. 2018 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2018, : 147 - 152
  • [6] Explicit Sentence Compression for Neural Machine Translation
    Li, Zuchao
    Wang, Rui
    Chen, Kehai
    Utiyama, Masao
    Sumita, Eiichiro
    Zhang, Zhuosheng
    Zhao, Hai
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8311 - 8318
  • [7] Long Sentence Preprocessing in Neural Machine Translation
    Ha Nguyen Tien
    Huyen Nguyen Thi Minh
    [J]. 2019 IEEE - RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES (RIVF), 2019, : 301 - 306
  • [8] Effective Adversarial Regularization for Neural Machine Translation
    Sato, Motoki
    Suzuki, Jun
    Kiyono, Shun
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 204 - 210
  • [9] Unsupervised Domain Adaptation for Neural Machine Translation
    Yang, Zhen
    Chen, Wei
    Wang, Feng
    Xu, Bo
    [J]. 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 338 - 343
  • [10] Unsupervised Neural Machine Translation with Universal Grammar
    Li, Zuchao
    Utiyama, Masao
    Sumita, Eiichiro
    Zhao, Hai
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3249 - 3264