LEARNING ASR-ROBUST CONTEXTUALIZED EMBEDDINGS FOR SPOKEN LANGUAGE UNDERSTANDING

Authors
Huang, Chao-Wei [1]
Chen, Yun-Nung [1]
Affiliations
[1] Natl Taiwan Univ, Taipei, Taiwan
Keywords
spoken language understanding; contextualized embedding; ASR robustness; recurrent neural networks
DOI
10.1109/icassp40776.2020.9054689
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Employing pre-trained language models (LMs) to extract contextualized word representations has achieved state-of-the-art performance on various NLP tasks. However, applying this technique to noisy transcripts generated by automatic speech recognition (ASR) remains a concern, because recognition errors can distort the extracted representations. Therefore, this paper focuses on making contextualized representations more ASR-robust. We propose a novel confusion-aware fine-tuning method to mitigate the impact of ASR errors on pre-trained LMs. Specifically, we fine-tune LMs to produce similar representations for acoustically confusable words, which are obtained from word confusion networks (WCNs) produced by ASR. Experiments on multiple benchmark datasets show that the proposed method significantly improves the performance of spoken language understanding on ASR transcripts.
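The idea of pulling the representations of acoustically confusable words together can be illustrated with a small sketch. This is not the authors' implementation: the function names `cosine_distance` and `confusion_loss`, the choice of cosine distance, and the toy vectors are all assumptions made for illustration. In practice the vectors would come from the language model being fine-tuned, and the word pairs from WCNs.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def confusion_loss(reps, confusable_pairs):
    """Mean distance between representations of confusable word pairs.

    reps: dict mapping a token key to its representation vector
          (hypothetical stand-in for contextualized LM outputs).
    confusable_pairs: list of (key_a, key_b) pairs, assumed here to
          have been extracted from word confusion networks.
    """
    if not confusable_pairs:
        return 0.0
    total = sum(cosine_distance(reps[a], reps[b]) for a, b in confusable_pairs)
    return total / len(confusable_pairs)


# Toy example with hand-written 2-d vectors (illustrative only).
reps = {
    "fair": [1.0, 0.0],
    "fare": [0.8, 0.6],
    "flight": [0.0, 1.0],
}
loss = confusion_loss(reps, [("fair", "fare")])
```

Minimizing a term like this during fine-tuning, typically alongside the usual language modeling objective, would encourage a correct word and its likely misrecognition to map to nearby representations. How the terms are weighted is not stated in the abstract and is left open here.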
Pages: 8009 - 8013
Page count: 5
Related papers
50 in total
  • [31] Transfer Learning Methods for Spoken Language Understanding
    Wang, Xu
    Tang, Chengda
    Zhao, Xiaotian
    Li, Xuancai
    Jin, Zhuolin
    Zheng, Dequan
    Zhao, Tiejun
    ICMI'19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2019, : 510 - 515
  • [32] Transfer Learning of Transformers for Spoken Language Understanding
    Svec, Jan
    Fremund, Adam
    Bulin, Martin
    Lehecka, Jan
    TEXT, SPEECH, AND DIALOGUE (TSD 2022), 2022, 13502 : 489 - 500
  • [33] Contrastive Learning Based ASR Robust Knowledge Selection For Spoken Dialogue System
    Zhu, Zhiyuan
    Liao, Yusheng
    Wang, Yu
    Guan, Yunfeng
    INTERSPEECH 2023, 2023, : 725 - 729
  • [34] A low latency ASR-free end to end spoken language understanding system
    Mhiri, Mohamed
    Myer, Samuel
    Tomar, Vikrant Singh
    INTERSPEECH 2020, 2020, : 1947 - 1951
  • [35] Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding
    Kim, Suyoun
    Arora, Abhinav
    Duc Le
    Yeh, Ching-Feng
    Fuegen, Christian
    Kalinli, Ozlem
    Seltzer, Michael L.
    INTERSPEECH 2021, 2021, : 1977 - 1981
  • [36] Robust dependency parsing for Spoken Language Understanding of spontaneous speech
    Bechet, Frederic
    Nasr, Alexis
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1027 - +
  • [37] MULTITASK LEARNING FOR LOW RESOURCE SPOKEN LANGUAGE UNDERSTANDING
    Meeus, Quentin
    Moens, Marie Francine
    Van Hamme, Hugo
    INTERSPEECH 2022, 2022, : 4073 - 4077
  • [38] A Joint Learning Framework With BERT for Spoken Language Understanding
    Zhang, Zhichang
    Zhang, Zhenwen
    Chen, Haoyuan
    Zhang, Zhiman
    IEEE ACCESS, 2019, 7 : 168849 - 168858
  • [39] Research on Spoken Language Understanding Based on Deep Learning
    Yanli Hui
    SCIENTIFIC PROGRAMMING, 2021, 2021
  • [40] Spoken language understanding using weakly supervised learning
    Wu, Wei-Lin
    Lu, Ru-Zhan
    Duan, Jian-Yong
    Liu, Hui
    Gao, Feng
    Chen, Yu-Quan
    COMPUTER SPEECH AND LANGUAGE, 2010, 24 (02): : 358 - 382