Harnessing Knowledge Distillation for Enhanced Text-to-Text Translation in Low-Resource Languages

Citations: 0
Authors
Ahmed, Manar Ouled [1 ]
Ming, Zuheng [3 ]
Othmani, Alice [2 ,4 ]
Affiliations
[1] Declic AI Res, Riyadh, Saudi Arabia
[2] Deck AI Res, Melbourne, Vic, Australia
[3] Univ Sorbonne Paris Nord, L2TI, Villetaneuse, France
[4] Univ Paris Est, UPEC, LISSI, Vitry Sur Seine, France
Keywords
Text-to-text; BART; Low-resource languages
DOI
10.1007/978-3-031-78014-1_22
Chinese Library Classification
O42 [Acoustics]
Discipline Codes
070206; 082403
Abstract
Text-to-text translation is crucial for effective communication and understanding across languages. In this paper, we present a deep learning-based approach to text-to-text translation. Our method leverages knowledge distillation from a high-performing teacher model, the BART model, to train a smaller and more efficient student model, the mBART model. To this end, we minimize the cross-entropy between the student model's distribution and a learned teacher distribution rather than the observed data, thereby achieving effective knowledge distillation. Our approach mitigates catastrophic forgetting, especially in low-resource languages, by exploiting the complementary knowledge provided by the teacher model. Extensive experiments show that our model outperforms state-of-the-art methods, achieving superior BLEU scores on benchmark datasets for French-to-Russian, English-to-Dutch, and Russian-to-Vietnamese translation. An ablation study further shows that combining fine-tuning with knowledge distillation enhances the student model's ability to capture linguistic nuances and produce more accurate translations.
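
As a concrete illustration of the distillation objective described in the abstract, below is a minimal PyTorch sketch of word-level knowledge distillation: the student is trained against the teacher's softened output distribution, mixed with the standard cross-entropy on the reference translations. The function name, temperature, mixing weight alpha, and padding index are illustrative assumptions, not the paper's reported setup; the sketch also assumes the teacher and student share a vocabulary, whereas BART and mBART tokenizers differ in practice and would require alignment or a shared tokenizer.

# Minimal sketch of word-level knowledge distillation: the student is
# trained to match a learned teacher distribution rather than only the
# one-hot observed data. All hyperparameters here are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      alpha=0.5, temperature=2.0, pad_id=1):
    """Mix soft (teacher) and hard (reference) cross-entropy terms.

    student_logits, teacher_logits: (batch, seq_len, vocab)
    labels: (batch, seq_len) reference token ids
    """
    t = temperature
    # Soft targets: KL divergence between the student's and the teacher's
    # temperature-softened distributions (equivalent up to a constant to
    # cross-entropy against the teacher distribution).
    soft = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)  # rescale gradients, as in Hinton et al. (2015)

    # Hard targets: standard cross-entropy against observed translations.
    hard = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=pad_id,
    )
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    # Toy shapes: batch of 2 target sequences, length 5, vocab of 100.
    s = torch.randn(2, 5, 100, requires_grad=True)
    t = torch.randn(2, 5, 100)  # teacher logits, computed under no_grad
    y = torch.randint(0, 100, (2, 5))
    print(distillation_loss(s, t, y).item())

In a training loop, the teacher's logits would be computed under torch.no_grad() on the same batch, and only the student's parameters would be updated from the combined loss.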
Pages: 295-307
Page count: 13