Fine-Tuning Self-Supervised Multilingual Sequence-To-Sequence Models for Extremely Low-Resource NMT

Cited by: 0
Authors
Thillainathan, Sarubi [1 ]
Ranathunga, Surangika [1 ]
Jayasena, Sanath [1 ]
Affiliations
[1] Univ Moratuwa, Dept Comp Sci & Engn, Katubedda, Sri Lanka
Keywords
neural machine translation; pre-trained models; fine-tuning; denoising autoencoder; low-resource languages; Sinhala; Tamil
DOI
10.1109/MERCON52712.2021.9525720
Chinese Library Classification (CLC): T [Industrial Technology]
Subject Classification Code: 08
Abstract
Neural Machine Translation (NMT) tends to perform poorly in low-resource settings due to the scarcity of parallel data. Instead of relying on inadequate parallel corpora, we can exploit the abundantly available monolingual data. One way to do so is to train a self-supervised multilingual sequence-to-sequence model as a denoising autoencoder over noised versions of large-scale monolingual corpora. For any language pair covered by such a pre-trained multilingual denoising model, the model can then be fine-tuned with a comparatively small amount of parallel data for that pair. This paper presents the fine-tuning of self-supervised multilingual sequence-to-sequence pre-trained models for extremely low-resource, domain-specific NMT. We choose one such pre-trained model, mBART, and are the first to implement and demonstrate the viability of non-English-centric complete fine-tuning of multilingual sequence-to-sequence pre-trained models. We select Sinhala, Tamil and English to demonstrate fine-tuning in extremely low-resource settings in the domain of official government documents. Experiments show that our fine-tuned mBART model significantly outperforms state-of-the-art Transformer-based NMT models in all six bilingual directions, with a 4.41 BLEU score increase on Tamil -> Sinhala and a 2.85 BLEU increase on Sinhala -> Tamil translation.
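To make the workflow described in the abstract concrete, the following is a minimal sketch (not the authors' released code) of complete fine-tuning of an mBART checkpoint on a tiny parallel sample using the Hugging Face Transformers library. The checkpoint name (facebook/mbart-large-50), the si_LK/ta_IN language codes, the placeholder sentences and the hyperparameters are illustrative assumptions, not values taken from the paper.

# Illustrative sketch only: complete fine-tuning of an assumed mBART checkpoint
# on a toy Sinhala -> Tamil parallel sample with Hugging Face Transformers.
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

MODEL_NAME = "facebook/mbart-large-50"  # assumed checkpoint; the paper's exact checkpoint may differ

# Tokenizer configured for the (assumed) Sinhala -> Tamil direction.
tokenizer = MBart50TokenizerFast.from_pretrained(
    MODEL_NAME, src_lang="si_LK", tgt_lang="ta_IN"
)
model = MBartForConditionalGeneration.from_pretrained(MODEL_NAME)

# Placeholder pairs standing in for a small domain-specific parallel corpus.
src_sentences = ["<Sinhala sentence 1>", "<Sinhala sentence 2>"]
tgt_sentences = ["<Tamil sentence 1>", "<Tamil sentence 2>"]

# Encode sources and targets; `labels` are produced from `text_target`.
batch = tokenizer(
    src_sentences,
    text_target=tgt_sentences,
    padding=True,
    truncation=True,
    max_length=128,
    return_tensors="pt",
)

# Complete fine-tuning: all model parameters are updated.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for _ in range(3):  # a few gradient steps for illustration only
    loss = model(**batch).loss  # token-level cross-entropy against the labels
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Translate: force the Tamil language token as the first decoded token.
model.eval()
inputs = tokenizer(src_sentences, padding=True, return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["ta_IN"],
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))

In a real run, the toy list would be replaced by a dataset loader over the parallel corpus, training would iterate over mini-batches for several epochs, and translation quality would be scored with BLEU on a held-out test set.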
Pages: 432 - 437
Page count: 6