Curriculum pre-training for stylized neural machine translation

被引:0
|
作者
Zou, Aixiao [1 ]
Wu, Xuanxuan [2 ]
Li, Xinjie [3 ]
Zhang, Ting [3 ]
Cui, Fuwei [4 ]
Xu, Jinan [2 ]
机构
[1] Beijing Wuzi Univ, Sch Informat, Beijing 101149, Peoples R China
[2] Beijing Jiaotong Univ, Sch Comp Informat Technol, 3 Shangyuan Rd, Haidian Beijing 100044, Peoples R China
[3] Global Tone Commun Technol Co Ltd, 20 Shijingshan Rd, Beijing 100131, Peoples R China
[4] Chinese Acad Sci, Inst Automat, 95 East Zhongguancun Rd, Beijing 100190, Peoples R China
基金
中国国家自然科学基金;
关键词
Stylized neural machine translation; Pre-training model; Data augmentation; Curriculum learning;
D O I
10.1007/s10489-024-05586-9
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Stylized neural machine translation (NMT) aims to translate sentences of one style into sentences of another style, it is essential for the application of machine translation in a real-world scenario. Most existing methods employ an encoder-decoder structure to understand, translate, and transform style simultaneously, which increases the learning difficulty of the model and leads to poor generalization ability. To address these issues, we propose a curriculum pre-training framework to improve stylized NMT. Specifically, we design four pre-training tasks of increasing difficulty to assist the model to extract more features essential for stylized translation. Then, we further propose a stylized-token aligned data augmentation method to expand the scale of pre-training corpus for alleviating the data-scarcity problem. Experiments show that our method achieves competitive results on MTFC and Modern-Classical translation dataset.
引用
收藏
页码:7958 / 7968
页数:11
相关论文
共 50 条
  • [31] Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation?
    Tamura, Hiroto
    Hirasawa, Tosho
    Kim, Hwichan
    Komachi, Mamoru
    [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 2216 - 2225
  • [32] Language Model Pre-training Method in Machine Translation Based on Named Entity Recognition
    Li, Zhen
    Qu, Dan
    Xie, Chaojie
    Zhang, Wenlin
    Li, Yanxia
    [J]. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2020, 29 (7-8)
  • [33] Reinforced Curriculum Learning on Pre-Trained Neural Machine Translation Models
    Zhao, Mingjun
    Wu, Haijiang
    Niu, Di
    Wang, Xiaoli
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 9652 - 9659
  • [34] Pre-training on dynamic graph neural networks
    Chen, Ke-Jia
    Zhang, Jiajun
    Jiang, Linpu
    Wang, Yunyun
    Dai, Yuxuan
    [J]. NEUROCOMPUTING, 2022, 500 : 679 - 687
  • [35] Neural speech enhancement with unsupervised pre-training and mixture training
    Hao, Xiang
    Xu, Chenglin
    Xie, Lei
    [J]. NEURAL NETWORKS, 2023, 158 : 216 - 227
  • [36] Improving Stylized Neural Machine Translation with Iterative Dual Knowledge Transfer
    Wu, Xuanxuan
    Liu, Jian
    Li, Xinjie
    Xu, Jinan
    Chen, Yufeng
    Zhang, Yujie
    Huang, Hui
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3971 - 3977
  • [37] Unsupervised Pre-training for Fully Convolutional Neural Networks
    Wiehman, Stiaan
    Kroon, Steve
    de Villiers, Hendrik
    [J]. 2016 PATTERN RECOGNITION ASSOCIATION OF SOUTH AFRICA AND ROBOTICS AND MECHATRONICS INTERNATIONAL CONFERENCE (PRASA-ROBMECH), 2016,
  • [38] Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training
    Song, Yuqing
    Chen, Shizhe
    Jin, Qin
    Luo, Wei
    Xie, Jun
    Huang, Fei
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2843 - 2852
  • [39] Graph Neural Pre-training for Recommendation with Side Information
    Liu, Siwei
    Meng, Zaiqiao
    Macdonald, Craig
    Ounis, Iadh
    [J]. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (03)
  • [40] Matrix product state pre-training for quantum machine learning
    Dborin, James
    Barratt, Fergus
    Wimalaweera, Vinul
    Wright, Lewis
    Green, Andrew G.
    [J]. QUANTUM SCIENCE AND TECHNOLOGY, 2022, 7 (03):