End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios

Cited by: 20
Authors
Bhosale, Swapnil [1]
Sheikh, Imran [1]
Dumpala, Sri Harsha [1]
Kopparapu, Sunil Kumar [1]
Affiliations
[1] TCS Res & Innovat Mumbai, Mumbai, Maharashtra, India
Source
Keywords
SLU; intent classification; low resource
DOI
10.21437/Interspeech.2019-2366
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
End-to-end Spoken Language Understanding (SLU) systems, which operate without speech-to-text conversion, are more promising in low resource scenarios. They can be more effective when there is not enough labeled data to train reliable speech recognition and language understanding systems, or where running SLU on the edge is preferred over cloud-based services. In this paper, we present an approach for bootstrapping end-to-end SLU in low resource scenarios. We show that incorporating layers extracted from pre-trained acoustic models, instead of using the typical Mel filter bank features, leads to better performing SLU models. Moreover, the layers extracted from a model pre-trained on one language perform well even (a) on SLU tasks in a different language and (b) on utterances from speakers with speech disorders.
Pages: 1188-1192
Page count: 5
Related papers
50 in total
  • [1] Low resource end-to-end spoken language understanding with capsule networks
    Poncelet, Jakob
    Renkens, Vincent
    Van hamme, Hugo
    [J]. COMPUTER SPEECH AND LANGUAGE, 2021, 66
  • [2] Towards End-to-End Spoken Language Understanding
    Serdyuk, Dmitriy
    Wang, Yongqiang
    Fuegen, Christian
    Kumar, Anuj
    Liu, Baiyang
    Bengio, Yoshua
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5754-5758
  • [3] Toward Low-Cost End-to-End Spoken Language Understanding
    Dinarelli, Marco
    Naguib, Marco
    Portet, Francois
    [J]. INTERSPEECH 2022, 2022: 2728-2732
  • [4] End-to-End Spoken Language Understanding: Performance analyses of a voice command task in a low resource setting
    Desot, Thierry
    Portet, Francois
    Vacher, Michel
    [J]. COMPUTER SPEECH AND LANGUAGE, 2022, 75
  • [5] Semantic Complexity in End-to-End Spoken Language Understanding
    McKenna, Joseph P.
    Choudhary, Samridhi
    Saxon, Michael
    Strimel, Grant P.
    Mouchtaris, Athanasios
    [J]. INTERSPEECH 2020, 2020: 4273-4277
  • [6] A Streaming End-to-End Framework For Spoken Language Understanding
    Potdar, Nihal
    Avila, Anderson R.
    Xing, Chao
    Wang, Dong
    Cao, Yiran
    Chen, Xiao
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021: 3906-3914
  • [7] WhiSLU: End-to-End Spoken Language Understanding with Whisper
    Wang, Minghan
    Li, Yinglu
    Guo, Jiaxin
    Qiao, Xiaosong
    Li, Zongyao
    Shang, Hengchao
    Wei, Daimeng
    Tao, Shimin
    Zhang, Min
    Yang, Hao
    [J]. INTERSPEECH 2023, 2023: 770-774
  • [8] Two-Pass Low Latency End-to-End Spoken Language Understanding
    Arora, Siddhant
    Dalmia, Siddharth
    Chang, Xuankai
    Yan, Brian
    Black, Alan
    Watanabe, Shinji
    [J]. INTERSPEECH 2022, 2022: 3478-3482
  • [9] Low-bit Shift Network for End-to-End Spoken Language Understanding
    Avila, Anderson R.
    Bibi, Khalil
    Yang, Ruiheng
    Li, Xinlin
    Xing, Chao
    Chen, Xiao
    [J]. INTERSPEECH 2022, 2022: 2698-2702
  • [10] End-to-End Spoken Language Understanding for Generalized Voice Assistants
    Saxon, Michael
    Choudhary, Samridhi
    McKenna, Joseph P.
    Mouchtaris, Athanasios
    [J]. INTERSPEECH 2021, 2021: 4738-4742