Joint Discriminative Decoding of Words and Semantic Tags for Spoken Language Understanding

Cited by: 11
Authors
Deoras, Anoop [1]
Tur, Gokhan [1]
Sarikaya, Ruhi [1]
Hakkani-Tuer, Dilek [1]
Affiliations
[1] Microsoft Corp, Mountain View, CA 94041 USA
Keywords
Joint Decoding; MaxEnt; CRF; SLU; ASR; lattice decoding; spoken language processing; speech and dialog understanding;
DOI
10.1109/TASL.2013.2256894
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Most spoken language understanding (SLU) systems today employ a cascade approach, in which the best hypothesis from the automatic speech recognizer (ASR) is fed into understanding modules such as slot sequence classifiers and intent detectors. The output of these modules is then fed into downstream components such as an interpreter and/or a knowledge broker. These statistical models are usually trained individually to optimize the error rate of their respective outputs. In such approaches, errors from one module propagate irreversibly into other modules, causing serious degradation in the overall performance of the SLU system. It is therefore desirable to optimize all the statistical models jointly. As a first step in this direction, we propose a joint decoding framework in which we predict the optimal word sequence and slot sequence (semantic tag sequence) jointly, given the input acoustic stream. The improved recognition output is then used for an utterance classification task; specifically, we focus on intent detection. On an SLU task, we show a 1.5% absolute reduction (7.6% relative reduction) in word error rate (WER) and a 1.2% absolute improvement in F-measure for slot prediction compared with a very strong cascade baseline comprising a state-of-the-art large-vocabulary ASR system followed by a conditional random field (CRF) based slot sequence tagger. Similarly, for intent detection, we show a 1.2% absolute reduction (12% relative reduction) in classification error rate.
Pages: 1612-1621
Page count: 10
Related papers
50 in total
  • [1] Error-Corrective Discriminative Joint Decoding of Automatic Spoken Language Transcription and Understanding
    Jabaian, Bassam
    Lefevre, Fabrice
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013: 2717-2721
  • [2] JOINT GENERATIVE AND DISCRIMINATIVE MODELS FOR SPOKEN LANGUAGE UNDERSTANDING
    Dinarelli, Marco
    Moschitti, Alessandro
    Riccardi, Giuseppe
    [J]. 2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS, 2008: 61-64
  • [3] Semantic Role Labeling with Discriminative Feature Selection for Spoken Language Understanding
    Liu, Chao-Hong
    Wu, Chung-Hsien
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009: 1039-1042
  • [4] Discriminative Reranking for Spoken Language Understanding
    Dinarelli, Marco
    Moschitti, Alessandro
    Riccardi, Giuseppe
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20(02): 526-539
  • [5] Discriminative Models for Spoken Language Understanding
    Wang, Ye-Yi
    Acero, Alex
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 2426-2429
  • [6] EEG decoding of spoken words in bilingual listeners: from words to language invariant semantic-conceptual representations
    Correia, Joao M.
    Jansma, Bernadette
    Hausfeld, Lars
    Kikkert, Sanne
    Bonte, Milene
    [J]. FRONTIERS IN PSYCHOLOGY, 2015, 6
  • [7] Generative and Discriminative Algorithms for Spoken Language Understanding
    Raymond, Christian
    Riccardi, Giuseppe
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007: 413-416
  • [8] HIERARCHICAL DISCRIMINATIVE MODEL FOR SPOKEN LANGUAGE UNDERSTANDING
    Svec, Jan
    Smidl, Lubos
    Ircing, Pavel
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 8322-8326
  • [9] A unified framework for translation and understanding allowing discriminative joint decoding for multilingual speech semantic interpretation
    Jabaian, Bassam
    Lefevre, Fabrice
    Besacier, Laurent
    [J]. COMPUTER SPEECH AND LANGUAGE, 2016, 35: 185-199
  • [10] Practical Semantic Parsing for Spoken Language Understanding
    Damonte, Marco
    Goel, Rahul
    Chung, Tagyoung
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 2 (INDUSTRY PAPERS), 2019: 16-23