LenM: Improving Low-Resource Neural Machine Translation Using Target Length Modeling

Cited by: 5
Authors
Mahsuli, Mohammad Mahdi [1 ]
Khadivi, Shahram [1 ]
Homayounpour, Mohammad Mehdi [1 ]
Affiliations
[1] Amirkabir University of Technology (Tehran Polytechnic), Department of Computer Engineering, Tehran, Iran
Keywords
Deep learning; Natural language processing; Neural machine translation; Recurrent neural network; Sequence-to-sequence mapping; Sequence length modeling; Multiplicative residual connection
DOI
10.1007/s11063-023-11208-1
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Neural machine translation (NMT) is an active field in artificial intelligence that aims to translate text from a source language into a different target language. Although NMT systems perform quite well in high-resource setups, their performance on low-resource data is poor. One aspect of data scarcity is the lack of diversity in the sentence lengths of the training data. Moreover, since a maximum sentence length is usually set during training, translation quality degenerates for sentences longer than that maximum. In this paper, we propose LenM, a method that models the length of the target (translated) sentence given the source sentence using a deep recurrent neural structure, and apply it to the decoder side of NMT systems to generate translations of appropriate length and better quality. The proposed method mitigates several drawbacks of NMT, such as output degradation on unseen sentence lengths and the limitations of using larger beam sizes during decoding. It can be applied to any NMT model regardless of its architecture and does not slow down translation. Moreover, it can be used efficiently in non-autoregressive machine translation systems, which need to know the target length before decoding. The overall outcome of this paper is an improvement in the output quality of NMT systems trained on low-resource corpora. Our experiments show that the proposed method outperforms state-of-the-art NMT systems when target lengths are mismatched between training and inference, with improvements of up to 9.82 BLEU points for German-to-English translation and up to 6.28 BLEU points for Arabic-to-English translation.
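The abstract describes the mechanism only at a high level: a deep recurrent network that predicts the target length from the source sentence, whose prediction then steers decoding. The sketch below is a minimal PyTorch-style illustration of that kind of length model, not the authors' implementation; the class name (LengthModel), the softplus regression head, and all hyperparameters are assumptions of this sketch.

```python
# Minimal illustrative sketch (PyTorch) of a recurrent target-length model in
# the spirit of the abstract. Names and hyperparameters are hypothetical, not
# the authors' released code.
import torch
import torch.nn as nn


class LengthModel(nn.Module):
    """Predict the target-sentence length from the source sentence using a
    deep (stacked, bidirectional) recurrent encoder."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, num_layers=num_layers,
                          batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, 1)  # regress a scalar length

    def forward(self, src_tokens):
        # src_tokens: (batch, src_len) integer token ids
        emb = self.embed(src_tokens)
        _, h = self.rnn(emb)            # h: (num_layers * 2, batch, hidden_dim)
        # Concatenate the last layer's forward and backward final states.
        last = torch.cat([h[-2], h[-1]], dim=-1)
        # softplus keeps the predicted length strictly positive.
        return nn.functional.softplus(self.head(last)).squeeze(-1)


# Usage sketch: predict per-sentence target lengths for a dummy batch.
model = LengthModel(vocab_size=32000)
src = torch.randint(0, 32000, (4, 17))   # 4 source sentences of 17 tokens
pred_len = model(src)                     # (4,) predicted target lengths
```

In an autoregressive decoder, such a prediction could serve as a length prior when rescoring beam-search hypotheses or as a per-sentence maximum output length; in a non-autoregressive system it would fix the number of output positions before decoding begins, matching the use case the abstract mentions.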
Pages: 9435-9466
Page count: 32
Related Papers (50 in total)
  • [1] LenM: Improving Low-Resource Neural Machine Translation Using Target Length Modeling
    Mohammad Mahdi Mahsuli
    Shahram Khadivi
    Mohammad Mehdi Homayounpour
    [J]. Neural Processing Letters, 2023, 55 : 9435 - 9466
  • [2] A Survey on Low-Resource Neural Machine Translation
    Wang, Rui
    Tan, Xu
    Luo, Renqian
    Qin, Tao
    Liu, Tie-Yan
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 4636 - 4643
  • [3] A Survey on Low-resource Neural Machine Translation
    Li, Hong-Zheng
    Feng, Chong
    Huang, He-Yan
    [J]. Zidonghua Xuebao/Acta Automatica Sinica, 2021, 47 (06): : 1217 - 1231
  • [4] Transformers for Low-resource Neural Machine Translation
    Gezmu, Andargachew Mekonnen
    Nuernberger, Andreas
    [J]. ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1, 2022, : 459 - 466
  • [5] Decoding Strategies for Improving Low-Resource Machine Translation
    Park, Chanjun
    Yang, Yeongwook
    Park, Kinam
    Lim, Heuiseok
    [J]. ELECTRONICS, 2020, 9 (10) : 1 - 15
  • [6] Low-Resource Neural Machine Translation with Neural Episodic Control
    Wu, Nier
    Hou, Hongxu
    Sun, Shuo
    Zheng, Wei
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [7] Low-resource Neural Machine Translation: Methods and Trends
    Shi, Shumin
    Wu, Xing
    Su, Rihai
    Huang, Heyan
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (05)
  • [8] Recent advances of low-resource neural machine translation
    Haque, Rejwanul
    Liu, Chao-Hong
    Way, Andy
    [J]. MACHINE TRANSLATION, 2021, 35 (04) : 451 - 474
  • [9] Neural Machine Translation for Low-resource Languages: A Survey
    Ranathunga, Surangika
    Lee, En-Shiun Annie
    Skenduli, Marjana Prifti
    Shekhar, Ravi
    Alam, Mehreen
    Kaur, Rishemjit
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (11)
  • [10] Data Augmentation for Low-Resource Neural Machine Translation
    Fadaee, Marzieh
    Bisazza, Arianna
    Monz, Christof
    [J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 2, 2017, : 567 - 573