Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications

Cited by: 5
Authors
Qian, Yao [1]
Ubale, Rutuja [1]
Lange, Patrick [1]
Evanini, Keelan [2]
Ramanarayanan, Vikram [1,3]
Soong, Frank K. [4]
Affiliations
[1] Educ Testing Serv Res, San Francisco, CA 94134 USA
[2] Educ Testing Serv Res, Princeton, NJ USA
[3] Univ Calif San Francisco, San Francisco, CA 94143 USA
[4] Microsoft Res Asia, Beijing, Peoples R China
Keywords
Spoken language understanding; Human-machine conversational systems; Computer assisted language learning; End-to-end modeling; Education;
DOI
10.1007/s11265-019-01484-3
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Spoken language understanding (SLU) in human-machine conversational systems is the process of interpreting the semantic meaning conveyed by a user's spoken utterance. Traditional SLU approaches transform the word string transcribed by an automatic speech recognition (ASR) system into a semantic label that determines the machine's subsequent response. However, the robustness of SLU results can suffer in a human-machine conversation-based language learning system because of ambient noise, heavily accented pronunciation, ungrammatical utterances, etc. To address these issues, this paper proposes an end-to-end (E2E) modeling approach for SLU and evaluates the semantic labeling performance of a bidirectional LSTM-RNN (BLSTM-RNN) with input at three different levels: acoustic (filterbank features), phonetic (subphone posteriorgrams), and lexical (ASR hypotheses). Experimental results for spoken responses collected in a dialog application designed for English learners to practice job interviewing skills show that multi-level BLSTM-RNNs can use complementary information from the three levels to improve semantic labeling performance. An analysis of results on out-of-vocabulary (OOV) utterances, which can be common in a conversation-based dialog system, also indicates that subphone posteriorgrams outperform ASR hypotheses and that incorporating the lower-level features for semantic labeling can improve the final SLU performance.
Pages: 805 - 817
Page count: 13
Related papers
50 in total
  • [1] Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications
    Yao Qian
    Rutuja Ubale
    Patrick Lange
    Keelan Evanini
    Vikram Ramanarayanan
    Frank K. Soong
    Journal of Signal Processing Systems, 2020, 92 : 805 - 817
  • [2] Robots that learn language: Developmental approach to human-machine conversations
    Iwahashi, Naoto
    SYMBOL GROUNDING AND BEYOND, PROCEEDINGS, 2006, 4211 : 143 - 167
  • [3] Automatic recognition and understanding of spoken language - A first step toward natural human-machine communication
    Juang, BH
    Furui, S
    PROCEEDINGS OF THE IEEE, 2000, 88 (08) : 1142 - 1165
  • [4] Spoken language understanding software for language learning
    Alam, Hassan
    Kumar, Aman
    Rahman, Fuad
    Hartono, Rachmat
    Tarnikova, Yuliya
    INT CONF ON CYBERNETICS AND INFORMATION TECHNOLOGIES, SYSTEMS AND APPLICATIONS/INT CONF ON COMPUTING, COMMUNICATIONS AND CONTROL TECHNOLOGIES, VOL II, 2007, : 107 - +
  • [5] Applications of Statistical Machine Translation Approaches to Spoken Language Understanding
    Macherey, Klaus
    Bender, Oliver
    Ney, Hermann
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (04) : 803 - 818
  • [6] Spoken language understanding and interaction: machine learning for human-like conversational systems
    Gasic, Milica
    Hakkani-Tur, Dilek
    Celikyilmaz, Asli
    COMPUTER SPEECH AND LANGUAGE, 2017, 46 : 249 - 251
  • [7] A Human-Machine Language Dictionary
    Fei Liu
    Shirin Akther Khanam
    Yi-Ping Phoebe Chen
    International Journal of Computational Intelligence Systems, 2020, 13 : 904 - 913
  • [8] A Human-Machine Language Dictionary
    Liu, Fei
    Khanam, Shirin Akther
    Chen, Yi-Ping Phoebe
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2020, 13 (01) : 904 - 913
  • [9] Active learning for spoken language understanding
    Tur, G
    Schapire, RE
    Hakkani-Tür, D
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 276 - 279
  • [10] Grammar learning for spoken language understanding
    Wang, YY
    Acero, A
    ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 292 - 295