Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications

Cited by: 0
Authors
Yao Qian
Rutuja Ubale
Patrick Lange
Keelan Evanini
Vikram Ramanarayanan
Frank K. Soong
Affiliations
[1] Educational Testing Service Research
[2] Educational Testing Service Research
[3] University of California
[4] Microsoft Research Asia
Keywords
Spoken language understanding; Human-machine conversational systems; Computer assisted language learning; End-to-end modeling; Education
DOI
Not available
Abstract
Spoken language understanding (SLU) in human-machine conversational systems is the process of interpreting the semantic meaning conveyed by a user’s spoken utterance. Traditional SLU approaches transform the word string transcribed by an automatic speech recognition (ASR) system into a semantic label that determines the machine’s subsequent response. However, the robustness of SLU results can suffer in a human-machine conversation-based language learning system because of ambient noise, heavily accented pronunciation, ungrammatical utterances, etc. To address these issues, this paper proposes an end-to-end (E2E) modeling approach for SLU and evaluates the semantic labeling performance of a bidirectional LSTM-RNN with input at three different levels: acoustic (filterbank features), phonetic (subphone posteriorgrams), and lexical (ASR hypotheses). Experimental results for spoken responses collected in a dialog application designed for English learners to practice job interviewing skills show that multi-level BLSTM-RNNs can utilize complementary information from the three levels to improve semantic labeling performance. An analysis of results on out-of-vocabulary (OOV) utterances, which can be common in a conversation-based dialog system, also indicates that subphone posteriorgrams outperform ASR hypotheses and that incorporating the lower-level features for semantic labeling can help improve the final SLU performance.
Pages: 805-817
Page count: 12