Bridging the Domain Gap: Improve Informal Language Translation via Counterfactual Domain Adaptation

Cited by: 0
Authors
Wang, Ke [1, 2]
Chen, Guandan [3 ]
Huang, Zhongqiang [3 ]
Wan, Xiaojun [1, 2]
Huang, Fei [3 ]
Affiliations
[1] Peking University, Wangxuan Institute of Computer Technology, Beijing, People's Republic of China
[2] Peking University, MOE Key Laboratory of Computational Linguistics, Beijing, People's Republic of China
[3] Alibaba Group, DAMO Academy, Hangzhou, People's Republic of China
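Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract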
Despite the near-human performance already achieved on formal texts such as news articles, neural machine translation still struggles with user-generated texts, which exhibit diverse linguistic phenomena but lack large-scale, high-quality parallel corpora. To address this problem, we propose a counterfactual domain adaptation method that better leverages both large-scale source-domain data (formal texts) and small-scale target-domain data (informal texts). Specifically, by considering effective counterfactual conditions (the concatenations of source-domain texts and the target-domain tag), we construct counterfactual representations to fill the sparse latent space of the target domain caused by the small amount of data, thereby bridging the gap between the source-domain and target-domain data. Experiments on English-to-Chinese and Chinese-to-English translation tasks show that our method outperforms the base model trained only on the informal corpus by a large margin, and consistently surpasses different baseline methods by +1.12 to 4.34 BLEU points on different datasets. Furthermore, we show that our method achieves competitive performance on cross-domain language translation on four language pairs.
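The counterfactual condition described in the abstract (a source-domain text concatenated with the target-domain tag) can be illustrated with a minimal sketch. It covers only the input-construction step, not the model or the training objective. The tag strings `<formal>` and `<informal>` and the helper names `tag_input` and `build_inputs` are assumptions made for illustration; the abstract does not specify how tags are represented or how the counterfactual representations are built from these inputs.

```python
from typing import List, Tuple

# Hypothetical tag tokens; the abstract does not specify their form.
FORMAL_TAG = "<formal>"      # source domain (formal texts)
INFORMAL_TAG = "<informal>"  # target domain (informal texts)


def tag_input(text: str, domain_tag: str) -> str:
    """Concatenate a domain tag with a sentence."""
    return f"{domain_tag} {text}"


def build_inputs(formal: List[str], informal: List[str]) -> List[Tuple[str, str]]:
    """Return (kind, tagged_input) pairs.

    "factual" inputs carry the tag of their own domain.
    "counterfactual" inputs pair a source-domain (formal) text
    with the target-domain (informal) tag.
    """
    examples: List[Tuple[str, str]] = []
    for sentence in formal:
        examples.append(("factual", tag_input(sentence, FORMAL_TAG)))
        examples.append(("counterfactual", tag_input(sentence, INFORMAL_TAG)))
    for sentence in informal:
        examples.append(("factual", tag_input(sentence, INFORMAL_TAG)))
    return examples


if __name__ == "__main__":
    for kind, tagged in build_inputs(["The meeting is at noon."], ["c u l8r"]):
        print(kind, "|", tagged)
```

In this sketch every formal sentence yields two inputs, so the large source-domain corpus also supplies inputs carrying the target-domain tag, which is one way to read the abstract's idea of filling the sparse target-domain space.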
Pages: 13970-13978
Page count: 9