Improving Low-Resource Neural Machine Translation With Teacher-Free Knowledge Distillation

Cited by: 3
Authors
Zhang, Xinlu [1 ,2 ,3 ]
Li, Xiao [1 ,2 ,3 ]
Yang, Yating [1 ,2 ,3 ]
Dong, Rui [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi 830011, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830011, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Decoding; Vocabulary; Task analysis; Standards; Knowledge engineering; Computational modeling; Neural machine translation; knowledge distillation; prior knowledge;
DOI
10.1109/ACCESS.2020.3037821
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Knowledge Distillation (KD) aims to distill the knowledge of a cumbersome teacher model into a lightweight student model. Its success is generally attributed to the privileged information about similarities among categories provided by the teacher model, and for this reason only strong teacher models are deployed to teach weaker students in practice. In low-resource neural machine translation, however, a stronger teacher model is usually not available. We therefore propose a novel Teacher-free Knowledge Distillation framework for low-resource neural machine translation, in which the model learns from a manually designed regularization distribution that acts as a virtual teacher. This hand-crafted prior distribution not only captures similarity information between words but also provides effective regularization for model training. Experimental results show that the proposed method effectively improves translation performance on low-resource languages.
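The abstract describes the approach only at a high level. As a rough, hypothetical sketch (in PyTorch), the snippet below shows one common way a teacher-free distillation objective of this kind can be set up: a virtual-teacher distribution is built by hand, putting most of the probability mass on the gold token and spreading the rest over the vocabulary, and it replaces real teacher outputs in the usual distillation loss. The function name `teacher_free_kd_loss` and the hyperparameters `correct_prob`, `temperature`, and `alpha` are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn.functional as F


def teacher_free_kd_loss(logits, target, correct_prob=0.9, temperature=2.0, alpha=0.5):
    """Teacher-free KD loss with a hand-crafted virtual-teacher distribution.

    logits: (batch, vocab_size) student outputs for one target position.
    target: (batch,) gold token ids.
    correct_prob, temperature, alpha are illustrative hyperparameters.
    """
    vocab_size = logits.size(-1)

    # Virtual teacher: put `correct_prob` mass on the gold token and spread
    # the remaining mass uniformly over the rest of the vocabulary.
    uniform_mass = (1.0 - correct_prob) / (vocab_size - 1)
    virtual_teacher = torch.full_like(logits, uniform_mass)
    virtual_teacher.scatter_(1, target.unsqueeze(1), correct_prob)

    # Standard cross-entropy against the gold token.
    ce = F.cross_entropy(logits, target)

    # KL divergence between the temperature-softened student distribution and
    # the fixed virtual-teacher distribution: the usual KD term, but with the
    # hand-designed prior standing in for a real teacher.
    student_log_probs = F.log_softmax(logits / temperature, dim=-1)
    kd = F.kl_div(student_log_probs, virtual_teacher, reduction="batchmean")

    return (1.0 - alpha) * ce + alpha * (temperature ** 2) * kd
```

The temperature-squared factor on the KD term follows the standard distillation recipe, keeping its gradient magnitude comparable to the cross-entropy term when the temperature is raised.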
Pages: 206638-206645
Page count: 8
Related Papers
50 records in total
  • [41] Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
    Choi, Gyu-Hyeon
    Shin, Jong-Hun
    Kim, Young-Kil
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 900 - 904
  • [42] Enhancing distant low-resource neural machine translation with semantic pivot
    Zhu, Enchang
    Huang, Yuxin
    Xian, Yantuan
    Zhu, Junguo
    Gao, Minghu
    Yu, Zhiqiang
    ALEXANDRIA ENGINEERING JOURNAL, 2025, 116 : 633 - 643
  • [43] Translation Memories as Baselines for Low-Resource Machine Translation
    Knowles, Rebecca
    Littell, Patrick
    LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6759 - 6767
  • [44] Keeping Models Consistent between Pretraining and Translation for Low-Resource Neural Machine Translation
    Zhang, Wenbo
    Li, Xiao
    Yang, Yating
    Dong, Rui
    Luo, Gongxu
    FUTURE INTERNET, 2020, 12 (12): 1 - 13
  • [45] Harnessing Knowledge Distillation for Enhanced Text-to-Text Translation in Low-Resource Languages
    Ahmed, Manar Ouled
    Ming, Zuheng
    Othmani, Alice
    SPEECH AND COMPUTER, SPECOM 2024, PT II, 2025, 15300 : 295 - 307
  • [46] Teacher-free Distillation via Regularizing Intermediate Representation
    Li, Lujun
    Liang, Shiuan-Ni
    Yang, Ya
    Jin, Zhe
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [47] Dual Knowledge Distillation for neural machine translation
    Wan, Yuxian
    Zhang, Wenlin
    Li, Zhen
    Zhang, Hao
    Li, Yanxia
    COMPUTER SPEECH AND LANGUAGE, 2024, 84
  • [48] Continual Knowledge Distillation for Neural Machine Translation
    Zhang, Yuanchi
    Li, Peng
    Sun, Maosong
    Liu, Yang
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 7978 - 7996
  • [49] Machine Translation into Low-resource Language Varieties
    Kumar, Sachin
    Anastasopoulos, Antonios
    Wintner, Shuly
    Tsvetkov, Yulia
    ACL-IJCNLP 2021: THE 59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 2, 2021, : 110 - 121
  • [50] Improving neural machine translation for low-resource Indian languages using rule-based feature extraction
    Muskaan Singh
    Ravinder Kumar
    Inderveer Chana
    Neural Computing and Applications, 2021, 33 : 1103 - 1122