Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding

Cited by: 0
Authors
Cappellazzo, Umberto [1 ]
Yang, Muqiao [2 ]
Falavigna, Daniele [3 ]
Brutti, Alessio [3 ]
Affiliations
[1] Univ Trento, Trento, Italy
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
[3] Fdn Bruno Kessler, Trento, Italy
Source
Keywords
continual learning; spoken language understanding; knowledge distillation; transformer; NEURAL-NETWORKS;
DOI
10.21437/Interspeech.2023-242
CLC Classification
O42 [Acoustics];
Subject Classification
070206 ; 082403 ;
Abstract
The ability to learn new concepts sequentially is a major weakness of modern neural networks, which hinders their use in non-stationary environments. Their propensity to fit the current data distribution to the detriment of previously acquired knowledge leads to the catastrophic forgetting issue. In this work we tackle the problem of Spoken Language Understanding applied to a continual learning setting. We first define a class-incremental scenario for the SLURP dataset. Then, we propose three knowledge distillation (KD) approaches to mitigate forgetting for a sequence-to-sequence transformer model: the first KD method is applied to the encoder output (audio-KD), and the other two work on the decoder output, either directly at the token level (tok-KD) or at the sequence level (seq-KD) of the distributions. We show that seq-KD substantially improves all the performance metrics, and its combination with audio-KD further decreases the average WER and enhances the entity prediction metric.
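As context for the three distillation variants named in the abstract, the sketch below contrasts a token-level KD loss (match the student's per-step output distribution to the teacher's) with a sequence-level KD loss (train the student on the teacher's best decoded sequence as a hard target, in the style of Kim & Rush, 2016). This is a minimal illustration in plain Python, not the paper's implementation; all function names and the toy distributions are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def token_level_kd(teacher_dists, student_dists):
    """tok-KD sketch: average KL between teacher and student output
    distributions, computed independently at each decoding step."""
    steps = list(zip(teacher_dists, student_dists))
    return sum(kl_divergence(t, s) for t, s in steps) / len(steps)

def sequence_level_kd(teacher_best_sequence, student_log_prob):
    """seq-KD sketch: take the teacher's best decoded sequence as a hard
    target and minimise the student's negative log-likelihood of it.
    `student_log_prob` is a (hypothetical) callable returning the student's
    log-probability of a full token sequence."""
    return -student_log_prob(teacher_best_sequence)
```

Note the structural difference: tok-KD supervises every decoding step with a soft distribution, while seq-KD collapses the teacher into a single hard sequence, which is what makes it applicable on top of beam-search outputs.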
Pages: 2953 / 2957
Page count: 5
Related Papers
50 items in total
  • [21] END-TO-END ARCHITECTURES FOR ASR-FREE SPOKEN LANGUAGE UNDERSTANDING
    Palogiannidi, Elisavet
    Gkinis, Ioannis
    Mastrapas, George
    Mizera, Petr
    Stafylakis, Themos
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7974 - 7978
  • [22] FROM AUDIO TO SEMANTICS: APPROACHES TO END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Haghani, Parisa
    Narayanan, Arun
    Bacchiani, Michiel
    Chuang, Galen
    Gaur, Neeraj
    Moreno, Pedro
    Prabhavalkar, Rohit
    Qu, Zhongdi
    Waters, Austin
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 720 - 726
  • [23] Toward Low-Cost End-to-End Spoken Language Understanding
    Dinarelli, Marco
    Naguib, Marco
    Portet, Francois
    INTERSPEECH 2022, 2022, : 2728 - 2732
  • [24] Low resource end-to-end spoken language understanding with capsule networks
    Poncelet, Jakob
    Renkens, Vincent
    Van hamme, Hugo
    COMPUTER SPEECH AND LANGUAGE, 2021, 66
  • [25] TOP-DOWN ATTENTION IN END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Chen, Yixin
    Lu, Weiyi
    Mottini, Alejandro
    Li, Li Erran
    Droppo, Jasha
    Du, Zheng
    Zeng, Belinda
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6199 - 6203
  • [26] SPEECH-LANGUAGE PRE-TRAINING FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Qian, Yao
    Bian, Ximo
    Shi, Yu
    Kanda, Naoyuki
    Shen, Leo
    Xiao, Zhen
    Zeng, Michael
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7458 - 7462
  • [27] Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding
    Kim, Suyoun
    Shrivastava, Akshat
    Le, Duc
    Lin, Ju
    Kalinli, Ozlem
    Seltzer, Michael L.
    INTERSPEECH 2023, 2023, : 1119 - 1123
  • [28] USING SPEECH SYNTHESIS TO TRAIN END-TO-END SPOKEN LANGUAGE UNDERSTANDING MODELS
    Lugosch, Loren
    Meyer, Brett H.
    Nowrouzezahrai, Derek
    Ravanelli, Mirco
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8499 - 8503
  • [29] Two-Pass Low Latency End-to-End Spoken Language Understanding
    Arora, Siddhant
    Dalmia, Siddharth
    Chang, Xuankai
    Yan, Brian
    Black, Alan
    Watanabe, Shinji
    INTERSPEECH 2022, 2022, : 3478 - 3482
  • [30] Low-bit Shift Network for End-to-End Spoken Language Understanding
    Avila, Anderson R.
    Bibi, Khalil
    Yang, Ruiheng
    Li, Xinlin
    Xing, Chao
    Chen, Xiao
    INTERSPEECH 2022, 2022, : 2698 - 2702