Machine Translation in Low-Resource Languages by an Adversarial Neural Network

被引:1
|
作者
Sun, Mengtao [1 ]
Wang, Hao [2 ]
Pasquine, Mark [3 ]
Hameed, Ibrahim A. [1 ]
机构
[1] Norwegian Univ Sci & Technol, Dept ICT & Nat Sci, N-6009 Alesund, Norway
[2] Norwegian Univ Sci & Technol, Dept Comp Sci, N-2815 Gjovik, Norway
[3] Norwegian Univ Sci & Technol, Dept Int Business, N-6009 Alesund, Norway
来源
APPLIED SCIENCES-BASEL | 2021年 / 11卷 / 22期
关键词
machine learning; adversarial machine learning; imbalanced datasets; transfer learning;
D O I
10.3390/app112210860
中图分类号
O6 [化学];
学科分类号
0703 ;
摘要
Existing Sequence-to-Sequence (Seq2Seq) Neural Machine Translation (NMT) shows strong capability with High-Resource Languages (HRLs). However, this approach poses serious challenges when processing Low-Resource Languages (LRLs), because the model expression is limited by the training scale of parallel sentence pairs. This study utilizes adversary and transfer learning techniques to mitigate the lack of sentence pairs in LRL corpora. We propose a new Low resource, Adversarial, Cross-lingual (LAC) model for NMT. In terms of the adversary technique, LAC model consists of a generator and discriminator. The generator is a Seq2Seq model that produces the translations from source to target languages, while the discriminator measures the gap between machine and human translations. In addition, we introduce transfer learning on LAC model to help capture the features in rare resources because some languages share the same subject-verb-object grammatical structure. Rather than using the entire pretrained LAC model, we separately utilize the pretrained generator and discriminator. The pretrained discriminator exhibited better performance in all experiments. Experimental results demonstrate that the LAC model achieves higher Bilingual Evaluation Understudy (BLEU) scores and has good potential to augment LRL translations.
引用
收藏
页数:18
相关论文
共 50 条
  • [1] Neural Machine Translation for Low-resource Languages: A Survey
    Ranathunga, Surangika
    Lee, En-Shiun Annie
    Skenduli, Marjana Prifti
    Shekhar, Ravi
    Alam, Mehreen
    Kaur, Rishemjit
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (11)
  • [2] Extremely low-resource neural machine translation for Asian languages
    Rubino, Raphael
    Marie, Benjamin
    Dabre, Raj
    Fujita, Atushi
    Utiyama, Masao
    Sumita, Eiichiro
    [J]. MACHINE TRANSLATION, 2020, 34 (04) : 347 - 382
  • [3] Neural Machine Translation of Low-Resource and Similar Languages with Backtranslation
    Przystupa, Michael
    Abdul-Mageed, Muhammad
    [J]. FOURTH CONFERENCE ON MACHINE TRANSLATION (WMT 2019), VOL 3: SHARED TASK PAPERS, DAY 2, 2019, : 224 - 235
  • [4] Benchmarking Neural and Statistical Machine Translation on Low-Resource African Languages
    Duh, Kevin
    McNamee, Paul
    Post, Matt
    Thompson, Brian
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 2667 - 2675
  • [5] An Analysis of Massively Multilingual Neural Machine Translation for Low-Resource Languages
    Mueller, Aaron
    Nicolai, Garrett
    McCarthy, Arya D.
    Lewis, Dylan
    Wu, Winston
    Yarowsky, David
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 3710 - 3718
  • [6] Towards a Low-Resource Neural Machine Translation for Indigenous Languages in Canada
    Ngoc Tan Le
    Sadat, Fatiha
    [J]. TRAITEMENT AUTOMATIQUE DES LANGUES, 2021, 62 (03): : 39 - 63
  • [7] Neural machine translation for low-resource languages without parallel corpora
    Karakanta, Alina
    Dehdari, Jon
    van Genabith, Josef
    [J]. MACHINE TRANSLATION, 2018, 32 (1-2) : 167 - 189
  • [8] Efficient Neural Machine Translation for Low-Resource Languages via Exploiting Related Languages
    Goyal, Vikrant
    Kumar, Sourav
    Sharma, Dipti Misra
    [J]. 58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020): STUDENT RESEARCH WORKSHOP, 2020, : 162 - 168
  • [9] DRA: dynamic routing attention for neural machine translation with low-resource languages
    Wang, Zhenhan
    Song, Ran
    Yu, Zhengtao
    Mao, Cunli
    Gao, Shengxiang
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024,
  • [10] OCR Improves Machine Translation for Low-Resource Languages
    Ignat, Oana
    Maillard, Jean
    Chaudhary, Vishrav
    Guzman, Francisco
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1164 - 1174