End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

Cited by: 0

Authors:

Gaido, Marco [1, 2]

Di Gangi, Mattia Antonino [1, 2]

Negri, Matteo [1]

Turchi, Marco [1]

Affiliations:

[1] Fdn Bruno Kessler, Trento, Italy

[2] Univ Trento, Trento, Italy

Source:

17TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE TRANSLATION (IWSLT 2020) | 2020

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper describes FBK's participation in the IWSLT 2020 offline speech translation (ST) task. The task evaluates systems' ability to translate audio from English TED talks into German text. The test talks are provided in two versions: one already segmented with automatic tools, and the other as raw data without any segmentation. Participants can decide whether to work on custom segmentation or not; we used the provided segmentation. Our system is an end-to-end model based on an adaptation of the Transformer for speech data. Its training process is the main focus of this paper and is based on: i) transfer learning (ASR pretraining and knowledge distillation), ii) data augmentation (SpecAugment, time stretch and synthetic data), iii) combining synthetic and real data marked as different domains, and iv) multitask learning using the CTC loss. Finally, after training with word-level knowledge distillation is complete, our ST models are fine-tuned using label-smoothed cross entropy. Our best model scored 29 BLEU on the MuST-C En-De test set, which is an excellent result compared to recent papers, and 23.7 BLEU on the same data segmented with VAD, showing the need for research on solutions that address this specific data condition.
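The two training objectives named in the abstract, word-level knowledge distillation followed by fine-tuning with label-smoothed cross entropy, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, tensor shapes, random inputs, and the choice of uniform smoothing over the full vocabulary are all assumptions made for illustration; the abstract does not specify the teacher model or the smoothing formulation.

```python
import numpy as np


def log_softmax(logits):
    # Numerically stable log-softmax over the last (vocabulary) axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def word_level_kd_loss(student_logits, teacher_probs):
    # Word-level KD: at every target position, cross entropy between the
    # teacher's output distribution and the student's output distribution.
    token_loss = -(teacher_probs * log_softmax(student_logits)).sum(axis=-1)
    return float(token_loss.mean())


def label_smoothed_ce(student_logits, targets, epsilon=0.1):
    # Assumed formulation: mix the one-hot target with a uniform
    # distribution over the whole vocabulary, weighted by epsilon.
    logp = log_softmax(student_logits)
    nll = -np.take_along_axis(logp, targets[:, None], axis=-1).squeeze(-1)
    smooth = -logp.mean(axis=-1)
    return float(((1.0 - epsilon) * nll + epsilon * smooth).mean())


# Hypothetical toy data: 5 target tokens, vocabulary of 8 entries.
rng = np.random.default_rng(0)
num_tokens, vocab_size = 5, 8
student_logits = rng.normal(size=(num_tokens, vocab_size))
teacher_probs = np.exp(log_softmax(rng.normal(size=(num_tokens, vocab_size))))
targets = rng.integers(0, vocab_size, size=num_tokens)

print("KD loss:", word_level_kd_loss(student_logits, teacher_probs))
print("Label-smoothed CE:", label_smoothed_ce(student_logits, targets))
```

With a one-hot teacher distribution, the distillation loss in this sketch reduces to ordinary cross entropy, which is the `epsilon=0.0` case of the fine-tuning loss. The two stages therefore differ only in the target distribution the student is asked to match.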


Pages: 80-88

Page count: 9
