Migration Learning and Multi-View Training for Low-Resource Machine Translation

Cited by: 0
Authors
Yan, Jing [1 ]
Lin, Tao [2 ]
Zhao, Shuai [3 ]
Affiliations
[1] Jiaozuo Univ, Dept Basic Courses, Jiaozuo, Peoples R China
[2] Jiaozuo Univ, Sch Continuing Educ, Jiaozuo, Peoples R China
[3] Jiaozuo Univ, Sch Artificial Intelligence, Jiaozuo, Peoples R China
Keywords
Low-resource machine translation; migration learning; continual pretraining; multidimensional linguistic feature integration; multi-view training
DOI
10.14569/IJACSA.2024.0150572
CLC Number
TP301 [Theory and Methods]
Subject Classification Number
081202
Abstract
This paper examines the main challenges of low-resource machine translation and proposes a translation method that combines migration learning (i.e., transfer learning) with multi-view training. In low-resource settings, neural machine translation models depend heavily on large parallel corpora and are therefore prone to poor generalization, inaccurate translation of long sentences, difficulty handling out-of-vocabulary words, and mistranslation of domain-specific terms. Transfer learning borrows general translation knowledge from high-resource languages: pretrained models such as BERT and XLM-R are fine-tuned so that they gradually adapt to the low-resource translation task. Multi-view training, in turn, integrates source- and target-language features at multiple levels, including the lexical, syntactic, and semantic levels, to strengthen the model's comprehension and translation ability under limited data. The experimental design covers pretrained-model selection, multi-view feature construction, and the fusion of transfer learning with multi-view training, and compares the proposed system against a randomly initialized Transformer, a pretraining-only model, and a traditional statistical machine translation system. The results show that the model trained with the multi-view strategy significantly outperforms the baselines on BLEU, TER, and ChrF, and is more robust and accurate on complex linguistic structures and domain-specific terminology.
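The paper publishes no implementation, so the sketches below only illustrate the pipeline the abstract describes, under stated assumptions. First, the transfer-learning stage: one plausible way to reuse encoder-only checkpoints such as XLM-R for translation is to warm-start both sides of an encoder-decoder model and fine-tune on the small parallel corpus of the low-resource pair. The checkpoint name, learning rate, and batching here are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the transfer-learning stage, assuming a Hugging Face
# `transformers` setup; the abstract names BERT and XLM-R but gives no code.
import torch
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# Warm-start both encoder and decoder from the multilingual checkpoint,
# then fine-tune on the small parallel corpus of the low-resource pair.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "xlm-roberta-base", "xlm-roberta-base"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

def fine_tune_step(src_sentences, tgt_sentences):
    """One gradient step on a batch of parallel sentences."""
    batch = tokenizer(src_sentences, padding=True, truncation=True,
                      return_tensors="pt")
    labels = tokenizer(tgt_sentences, padding=True, truncation=True,
                       return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in loss
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Second, the multi-view feature integration. The abstract says word-level, syntactic, and semantic views are fused but not how; the gated projection below is one hypothetical instantiation of that idea, not the authors' architecture.

```python
# Hypothetical fusion of word-, syntax-, and semantics-level views via
# gated projection; the paper does not specify its fusion operator.
import torch
import torch.nn as nn

class MultiViewFusion(nn.Module):
    def __init__(self, d_word, d_syn, d_sem, d_model):
        super().__init__()
        # Project each view into the shared model dimension.
        self.proj = nn.ModuleList(
            nn.Linear(d, d_model) for d in (d_word, d_syn, d_sem)
        )
        # Learn a per-position weight for each of the three views.
        self.gate = nn.Linear(3 * d_model, 3)

    def forward(self, word_feats, syn_feats, sem_feats):
        views = [p(x) for p, x in
                 zip(self.proj, (word_feats, syn_feats, sem_feats))]
        weights = torch.softmax(self.gate(torch.cat(views, dim=-1)), dim=-1)
        return sum(w.unsqueeze(-1) * v
                   for w, v in zip(weights.unbind(-1), views))
```

Finally, the metrics the abstract reports (BLEU, TER, ChrF) can all be computed with the sacrebleu library; the sentences here are toy data.

```python
# Scoring sketch for the reported metrics, using sacrebleu's corpus-level API.
import sacrebleu

hypotheses = ["the model translates the sentence"]
references = [["the model translates this sentence"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
ter = sacrebleu.corpus_ter(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU={bleu.score:.1f}  TER={ter.score:.1f}  ChrF={chrf.score:.1f}")
```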
Pages: 719-728
Number of pages: 10