Multilingual Pre-training Model-Assisted Contrastive Learning Neural Machine Translation

Cited: 0
Authors
Sun, Shuo [1]
Hou, Hong-xu [1]
Yang, Zong-heng [1]
Wang, Yi-song [1]
Affiliations
[1] Inner Mongolia Univ, Coll Comp Sci, Natl & Local Joint Engn Res Ctr Intelligent Infor, Inner Mongolia Key Lab Mongolian Informat Proc Te, Hohhot, Peoples R China
Keywords
Low-Resource NMT; Pre-training Model; Contrastive Learning; Dynamic Training;
DOI
10.1109/IJCNN54540.2023.10191766
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Since pre-training and fine-tuning have become a successful paradigm in Natural Language Processing (NLP), this paper adopts the SOTA pre-training model CeMAT as a strong assistant for low-resource ethnic language translation tasks. To address the exposure bias problem in the fine-tuning process, we adopt a contrastive learning framework and propose a new method for generating contrastive examples, which uses the model's own predictions as contrastive examples so that the model is exposed to its inference-time errors. Moreover, to make effective use of the limited bilingual data in low-resource tasks, this paper proposes a dynamic training strategy that fine-tunes the model step by step, using word embedding norm and uncertainty as the criteria for evaluating the data and the model, respectively. Experimental results demonstrate that our method significantly improves translation quality over the baselines, which verifies its effectiveness.
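The abstract outlines two mechanisms: a contrastive objective whose hard negatives are the model's own predictions, and a dynamic training schedule driven by word-embedding norm and model uncertainty. The following is a minimal, hypothetical PyTorch sketch of those ideas, not the authors' released code; the function names, tensor shapes, and the temperature value are assumptions for illustration.

import torch
import torch.nn.functional as F

def self_prediction_contrastive_loss(src_repr, tgt_repr, pred_repr, temperature=0.1):
    # src_repr:  (batch, dim) sentence-level representation of the source
    # tgt_repr:  (batch, dim) representation of the gold target (positive)
    # pred_repr: (batch, dim) representation of the model's own prediction,
    #            used as a hard negative to expose inference-time errors
    src = F.normalize(src_repr, dim=-1)
    pos = F.normalize(tgt_repr, dim=-1)
    neg = F.normalize(pred_repr, dim=-1)
    pos_sim = (src * pos).sum(-1, keepdim=True) / temperature  # (batch, 1)
    neg_sim = (src * neg).sum(-1, keepdim=True) / temperature  # (batch, 1)
    logits = torch.cat([pos_sim, neg_sim], dim=-1)             # (batch, 2)
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)                     # positive is class 0

def order_by_embedding_norm(norms):
    # Rough easy-to-hard ordering of training examples by word-embedding norm,
    # standing in for the data-evaluation criterion of the dynamic training strategy.
    return sorted(range(len(norms)), key=lambda i: norms[i])

In this sketch each source sentence is contrasted only against its gold translation and its own decoded output; the paper may additionally use in-batch negatives or a different similarity function.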
Pages: 7