Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding

Cited by: 0
Authors
Cappellazzo, Umberto [1 ]
Yang, Muqiao [2 ]
Falavigna, Daniele [3 ]
Brutti, Alessio [3 ]
Affiliations
[1] Univ Trento, Trento, Italy
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
[3] Fdn Bruno Kessler, Trento, Italy
Keywords
continual learning; spoken language understanding; knowledge distillation; transformer; NEURAL-NETWORKS;
DOI
10.21437/Interspeech.2023-242
Chinese Library Classification
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
The ability to learn new concepts sequentially is a major weakness of modern neural networks, which hinders their use in non-stationary environments. Their propensity to fit the current data distribution to the detriment of previously acquired knowledge leads to the catastrophic forgetting issue. In this work we tackle the problem of Spoken Language Understanding applied to a continual learning setting. We first define a class-incremental scenario for the SLURP dataset. Then, we propose three knowledge distillation (KD) approaches to mitigate forgetting for a sequence-to-sequence transformer model: the first KD method is applied to the encoder output (audio-KD), and the other two work on the decoder output, either directly on the token-level (tok-KD) or on the sequence-level (seq-KD) distributions. We show that the seq-KD substantially improves all the performance metrics, and its combination with the audio-KD further decreases the average WER and enhances the entity prediction metric.
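The two decoder-side distillation variants mentioned in the abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the function names, the NumPy setting, and the temperature value are assumptions. Token-level KD matches the per-token output distributions of teacher and student via a KL divergence, while sequence-level KD instead trains the student on the teacher's decoded sequence as hard pseudo-targets.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis (vocabulary)."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def token_kd_loss(student_logits, teacher_logits, T=2.0):
    """Token-level KD: mean KL(teacher || student) over positions.

    Both inputs have shape (num_tokens, vocab_size); the T^2 factor is the
    usual scaling so gradient magnitudes stay comparable across temperatures.
    """
    p = softmax(teacher_logits, T)               # teacher distribution per token
    log_q = np.log(softmax(student_logits, T))   # student log-distribution
    kl = (p * (np.log(p) - log_q)).sum(axis=-1)  # per-position KL divergence
    return (T * T) * kl.mean()

def seq_kd_targets(teacher_logits):
    """Seq-KD: use the teacher's decoded sequence (here: greedy argmax,
    standing in for beam search) as hard targets for the student."""
    return teacher_logits.argmax(axis=-1)
```

With seq-KD, the student is then trained with ordinary cross-entropy against `seq_kd_targets(...)` instead of the ground-truth transcript, so the old model's behavior is rehearsed at the sequence level rather than matched token by token.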
Pages: 2953-2957
Page count: 5
Related Papers
50 records total
  • [1] AN END-TO-END ARCHITECTURE FOR CLASS-INCREMENTAL OBJECT DETECTION WITH KNOWLEDGE DISTILLATION
    Hao, Yu
    Fu, Yanwei
    Jiang, Yu-Gang
    Tian, Qi
    2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1 - 6
  • [2] TWO-STAGE TEXTUAL KNOWLEDGE DISTILLATION FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Kim, Seongbin
    Kim, Gyuwan
    Shin, Seongjin
    Lee, Sangmin
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7463 - 7467
  • [3] TOWARDS END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Serdyuk, Dmitriy
    Wang, Yongqiang
    Fuegen, Christian
    Kumar, Anuj
    Liu, Baiyang
    Bengio, Yoshua
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5754 - 5758
  • [4] ConvKT: Conversation-Level Knowledge Transfer for Context Aware End-to-End Spoken Language Understanding
    Sunder, Vishal
    Fosler-Lussier, Eric
    Thomas, Samuel
    Kuo, Hong-Kwang J.
    Kingsbury, Brian
    INTERSPEECH 2023, 2023, : 1129 - 1133
  • [5] A Streaming End-to-End Framework For Spoken Language Understanding
    Potdar, Nihal
    Avila, Anderson R.
    Xing, Chao
    Wang, Dong
    Cao, Yiran
    Chen, Xiao
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3906 - 3914
  • [6] Semantic Complexity in End-to-End Spoken Language Understanding
    McKenna, Joseph P.
    Choudhary, Samridhi
    Saxon, Michael
    Strimel, Grant P.
    Mouchtaris, Athanasios
    INTERSPEECH 2020, 2020, : 4273 - 4277
  • [7] WhiSLU: End-to-End Spoken Language Understanding with Whisper
    Wang, Minghan
    Li, Yinglu
    Guo, Jiaxin
    Qiao, Xiaosong
    Li, Zongyao
    Shang, Hengchao
    Wei, Daimeng
    Tao, Shimin
    Zhang, Min
    Yang, Hao
    INTERSPEECH 2023, 2023, : 770 - 774
  • [8] End-to-End Spoken Language Understanding Without Full Transcripts
    Kuo, Hong-Kwang J.
    Tuske, Zoltan
    Thomas, Samuel
    Huang, Yinghui
    Audhkhasi, Kartik
    Kingsbury, Brian
    Kurata, Gakuto
    Kons, Zvi
    Hoory, Ron
    Lastras, Luis
    INTERSPEECH 2020, 2020, : 906 - 910
  • [9] End-to-End Spoken Language Understanding for Generalized Voice Assistants
    Saxon, Michael
    Choudhary, Samridhi
    McKenna, Joseph P.
    Mouchtaris, Athanasios
    INTERSPEECH 2021, 2021, : 4738 - 4742
  • [10] End-to-End Neural Transformer Based Spoken Language Understanding
    Radfar, Martin
    Mouchtaris, Athanasios
    Kunzmann, Siegfried
    INTERSPEECH 2020, 2020, : 866 - 870