Joint Discriminative Decoding of Words and Semantic Tags for Spoken Language Understanding

Cited by: 11
Authors
Deoras, Anoop [1]
Tur, Gokhan [1]
Sarikaya, Ruhi [1]
Hakkani-Tuer, Dilek [1]
Affiliation
[1] Microsoft Corp, Mountain View, CA 94041 USA
Keywords
Joint Decoding; MaxEnt; CRF; SLU; ASR; lattice decoding; spoken language processing; speech and dialog understanding;
DOI
10.1109/TASL.2013.2256894
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Most Spoken Language Understanding (SLU) systems today employ a cascade approach, in which the best hypothesis from an Automatic Speech Recognizer (ASR) is fed into understanding modules such as slot sequence classifiers and intent detectors. The output of these modules is then fed into downstream components such as an interpreter and/or a knowledge broker. These statistical models are usually trained individually to optimize the error rate of their respective outputs. In such approaches, errors from one module irreversibly propagate into other modules, causing a serious degradation in the overall performance of the SLU system. It is therefore desirable to optimize all the statistical models jointly. As a first step towards this, we propose a joint decoding framework in which we predict the optimal word sequence and slot sequence (semantic tag sequence) jointly, given the input acoustic stream. The improved recognition output is then used for an utterance classification task; specifically, we focus on intent detection. On an SLU task, we show a 1.5% absolute reduction (7.6% relative reduction) in word error rate (WER) and a 1.2% absolute improvement in F-measure for slot prediction when compared to a very strong cascade baseline comprising a state-of-the-art large vocabulary ASR followed by a conditional random field (CRF) based slot sequence tagger. Similarly, for intent detection, we show a 1.2% absolute reduction (12% relative reduction) in classification error rate.
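The contrast between cascade and joint decoding described in the abstract can be illustrated with a small sketch. This is not the paper's method: the paper decodes over lattices with trained discriminative models, whereas the sketch below rescores a toy N-best list and tags each word independently. The hypotheses, scores, tag names, and the `tag_weight` parameter are all hypothetical values chosen for illustration.

```python
# Hypothetical toy N-best list: (word sequence, ASR log-score).
NBEST = [
    (("show", "flights", "to", "austin"), -4.0),
    (("show", "flights", "to", "boston"), -4.2),
]

# Hypothetical per-word tag log-scores standing in for a trained slot tagger.
TAG_SCORES = {
    "boston": {"B-city": -0.1, "O": -2.5},
    "austin": {"B-city": -1.2, "O": -0.9},
}
DEFAULT_SCORES = {"O": -0.1, "B-city": -3.0}


def best_tags(words):
    """Pick each word's best tag independently; return (tags, total log-score)."""
    tags, total = [], 0.0
    for word in words:
        scores = TAG_SCORES.get(word, DEFAULT_SCORES)
        tag = max(scores, key=scores.get)
        tags.append(tag)
        total += scores[tag]
    return tuple(tags), total


def cascade_decode(nbest):
    """Cascade: commit to the ASR 1-best first, then tag only that hypothesis."""
    words, _ = max(nbest, key=lambda hyp: hyp[1])
    tags, _ = best_tags(words)
    return words, tags


def joint_decode(nbest, tag_weight=1.0):
    """Joint: choose the (words, tags) pair maximizing the combined log-score."""
    best = None
    for words, asr_score in nbest:
        tags, tag_score = best_tags(words)
        total = asr_score + tag_weight * tag_score
        if best is None or total > best[0]:
            best = (total, words, tags)
    return best[1], best[2]


if __name__ == "__main__":
    print("cascade:", cascade_decode(NBEST))
    print("joint:  ", joint_decode(NBEST))
```

In this toy setup the cascade keeps the ASR 1-best and cannot recover from it, while the joint search lets a confident tag score promote the second hypothesis. Setting `tag_weight=0.0` reduces the joint search to the ASR 1-best word sequence.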
Pages: 1612-1621
Page count: 10