Migration Learning and Multi-View Training for Low-Resource Machine Translation

Cited by: 0
Authors
Yan, Jing [1 ]
Lin, Tao [2 ]
Zhao, Shuai [3 ]
Affiliations
[1] Jiaozuo Univ, Dept Basic Courses, Jiaozuo, Peoples R China
[2] Jiaozuo Univ, Sch Continuing Educ, Jiaozuo, Peoples R China
[3] Jiaozuo Univ, Sch Artificial Intelligence, Jiaozuo, Peoples R China
Keywords
Low-resource machine translation; migration learning; continual pretraining; multidimensional linguistic feature integration; multi-view training
DOI
10.14569/IJACSA.2024.0150572
CLC Number
TP301 [Theory and Methods]
Subject Classification Number
081202
Abstract
This paper examines the main challenges of low-resource machine translation and proposes a translation method that combines migration learning (i.e., transfer learning) with multi-view training. In low-resource settings, neural machine translation models depend heavily on large parallel corpora and are therefore prone to poor generalization, inaccurate translation of long sentences, difficulty handling out-of-vocabulary words, and mistranslation of domain-specific terms. Transfer learning borrows general translation knowledge from high-resource languages: pretrained models such as BERT and XLM-R are fine-tuned so that they gradually adapt to the low-resource translation task. Multi-view training, in turn, integrates source- and target-language features at multiple levels, including the lexical, syntactic, and semantic levels, to strengthen the model's comprehension and translation ability under limited data. The experimental design covers pretrained-model selection, multi-view feature construction, and the fusion of transfer learning with multi-view training, and compares the proposed system against a randomly initialized Transformer, a pretraining-only model, and a traditional statistical machine translation system. The results show that the model trained with the multi-view strategy significantly outperforms the baselines on BLEU, TER, and ChrF, and is more robust and accurate on complex linguistic structures and domain-specific terminology.
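The paper publishes no implementation, so the sketches below only illustrate the pipeline the abstract describes, under stated assumptions. First, the transfer-learning stage: one plausible way to reuse encoder-only checkpoints such as XLM-R for translation is to warm-start both sides of an encoder-decoder model and fine-tune on the small parallel corpus of the low-resource pair. The checkpoint name, learning rate, and batching here are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the transfer-learning stage, assuming a Hugging Face
# `transformers` setup; the abstract names BERT and XLM-R but gives no code.
import torch
from transformers import AutoTokenizer, EncoderDecoderModel

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# Warm-start both encoder and decoder from the multilingual checkpoint,
# then fine-tune on the small parallel corpus of the low-resource pair.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "xlm-roberta-base", "xlm-roberta-base"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

def fine_tune_step(src_sentences, tgt_sentences):
    """One gradient step on a batch of parallel sentences."""
    batch = tokenizer(src_sentences, padding=True, truncation=True,
                      return_tensors="pt")
    labels = tokenizer(tgt_sentences, padding=True, truncation=True,
                       return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in loss
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Second, the multi-view feature integration. The abstract says word-level, syntactic, and semantic views are fused but not how; the gated projection below is one hypothetical instantiation of that idea, not the authors' architecture.

```python
# Hypothetical fusion of word-, syntax-, and semantics-level views via
# gated projection; the paper does not specify its fusion operator.
import torch
import torch.nn as nn

class MultiViewFusion(nn.Module):
    def __init__(self, d_word, d_syn, d_sem, d_model):
        super().__init__()
        # Project each view into the shared model dimension.
        self.proj = nn.ModuleList(
            nn.Linear(d, d_model) for d in (d_word, d_syn, d_sem)
        )
        # Learn a per-position weight for each of the three views.
        self.gate = nn.Linear(3 * d_model, 3)

    def forward(self, word_feats, syn_feats, sem_feats):
        views = [p(x) for p, x in
                 zip(self.proj, (word_feats, syn_feats, sem_feats))]
        weights = torch.softmax(self.gate(torch.cat(views, dim=-1)), dim=-1)
        return sum(w.unsqueeze(-1) * v
                   for w, v in zip(weights.unbind(-1), views))
```

Finally, the metrics the abstract reports (BLEU, TER, ChrF) can all be computed with the sacrebleu library; the sentences here are toy data.

```python
# Scoring sketch for the reported metrics, using sacrebleu's corpus-level API.
import sacrebleu

hypotheses = ["the model translates the sentence"]
references = [["the model translates this sentence"]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
ter = sacrebleu.corpus_ter(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f"BLEU={bleu.score:.1f}  TER={ter.score:.1f}  ChrF={chrf.score:.1f}")
```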
Pages: 719-728
Number of pages: 10