End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

Cited by: 0
Authors
Gaido, Marco [1, 2]
Di Gangi, Mattia Antonino [1, 2]
Negri, Matteo [1 ]
Turchi, Marco [1 ]
Affiliations
[1] Fondazione Bruno Kessler, Trento, Italy
[2] University of Trento, Trento, Italy
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes FBK's participation in the IWSLT 2020 offline speech translation (ST) task, which evaluates systems' ability to translate the audio of English TED talks into German text. The test talks are provided in two versions: one already segmented with automatic tools, and one as raw audio without any segmentation. Participants can decide whether to apply a custom segmentation; we used the provided one. Our system is an end-to-end model based on an adaptation of the Transformer for speech data. Its training process is the main focus of this paper and relies on: i) transfer learning (ASR pretraining and knowledge distillation), ii) data augmentation (SpecAugment, time stretch, and synthetic data), iii) combining synthetic and real data marked as different domains, and iv) multitask learning using the CTC loss. Finally, once training with word-level knowledge distillation is complete, our ST models are fine-tuned using label-smoothed cross entropy. Our best model scored 29 BLEU on the MuST-C En-De test set, which compares favorably with results reported in recent papers, and 23.7 BLEU on the same data segmented with VAD, showing the need for research on solutions that address this specific data condition.
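The two training objectives named in the abstract, word-level knowledge distillation followed by fine-tuning with label-smoothed cross entropy, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy inputs, and the choice to spread the smoothing mass uniformly over the whole vocabulary are assumptions made for illustration, and the teacher distributions are assumed to come from some separately trained teacher model.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def word_level_kd_loss(student_logits, teacher_probs):
    """Word-level knowledge distillation (illustrative sketch).

    For every target position, take the cross entropy between the
    teacher's output distribution and the student's distribution,
    then average over positions.
    """
    total = 0.0
    for logits, probs in zip(student_logits, teacher_probs):
        log_p = log_softmax(logits)
        total -= sum(q * lp for q, lp in zip(probs, log_p))
    return total / len(student_logits)


def label_smoothed_ce(student_logits, targets, epsilon=0.1):
    """Label-smoothed cross entropy (illustrative sketch).

    Assumption: a fraction `epsilon` of the probability mass is spread
    uniformly over the whole vocabulary, the rest stays on the
    reference token.
    """
    total = 0.0
    for logits, y in zip(student_logits, targets):
        log_p = log_softmax(logits)
        nll = -log_p[y]                       # loss on the reference token
        uniform = -sum(log_p) / len(logits)   # loss against a uniform target
        total += (1.0 - epsilon) * nll + epsilon * uniform
    return total / len(student_logits)


# Toy example: 2 target positions, vocabulary of 4 tokens.
student = [[0.0, 0.0, 0.0, 0.0], [2.0, 1.0, 0.0, -1.0]]
teacher = [[0.7, 0.1, 0.1, 0.1], [0.2, 0.6, 0.1, 0.1]]
kd = word_level_kd_loss(student, teacher)
ce = label_smoothed_ce(student, [0, 1], epsilon=0.1)
```

With a one-hot teacher and `epsilon=0.0`, the two functions return the same value, which makes the relationship explicit: distillation replaces the hard reference token with the teacher's full distribution, while label smoothing replaces it with a fixed mixture of the reference and a uniform distribution.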
Pages: 80-88
Number of pages: 9