GiniSupport vector machines for segmental minimum Bayes risk decoding of continuous speech

Cited by: 5
Authors
Venkataramani, Veera
Chakrabartty, Shantanu
Byrne, William
Affiliations
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
[2] Fair Isaac Corp, San Diego, CA 92130 USA
[3] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
Source
COMPUTER SPEECH AND LANGUAGE | 2007 / Vol. 21 / No. 3
DOI
10.1016/j.csl.2006.08.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We describe the use of support vector machines (SVMs) for continuous speech recognition by incorporating them into segmental minimum Bayes risk decoding. Lattice cutting is used to convert the automatic speech recognition search space into sequences of smaller recognition problems. SVMs are then trained as discriminative models over each of these problems and used in a rescoring framework. We pose the estimation of a posterior distribution over hypotheses in these regions of acoustic confusion as a logistic regression problem. We also show that GiniSVMs can be used as an approximation technique to estimate the parameters of the logistic regression problem. On a small-vocabulary recognition task, we show that the use of GiniSVMs can improve the performance of a well-trained hidden Markov model system trained under the Maximum Mutual Information criterion. We also find that it is possible to derive reliable confidence scores over the GiniSVM hypotheses and that these can be used to good effect in hypothesis combination. We discuss the problems that we expect to encounter in extending this approach to large-vocabulary continuous speech recognition and describe an initial investigation of constrained estimation techniques to derive feature spaces for SVMs. © 2006 Elsevier Ltd. All rights reserved.
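As a rough illustration of the rescoring idea summarized in the abstract, the Python sketch below assumes that each region of acoustic confusion has been reduced to exactly two competing word hypotheses and that an already trained classifier supplies one real-valued score per region. The segment data, the function names, and the plain sigmoid link are illustrative assumptions only; they are not the authors' GiniSVM formulation or their lattice-cutting procedure.

```python
import math


def logistic_posterior(score: float) -> float:
    """Map a real-valued classifier score to P(first hypothesis | segment).

    Uses a numerically stable form of the sigmoid 1 / (1 + exp(-score)).
    """
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)


def rescore_segments(segments):
    """Pick the more probable hypothesis in each binary confusion segment.

    `segments` is a list of (hyp_a, hyp_b, score) tuples, where a positive
    score favours hyp_a. Returns a list of (word, confidence) pairs.
    """
    decoded = []
    for hyp_a, hyp_b, score in segments:
        p_a = logistic_posterior(score)
        if p_a >= 0.5:
            decoded.append((hyp_a, p_a))
        else:
            decoded.append((hyp_b, 1.0 - p_a))
    return decoded


# Hypothetical confusion segments: (first word, second word, classifier score).
segments = [("B", "V", 1.2), ("F", "S", -0.4), ("A", "8", 0.0)]
decoded = rescore_segments(segments)
words = [word for word, _ in decoded]
```

In this sketch the per-segment posterior doubles as a confidence score, which is the kind of quantity that hypothesis combination could draw on; how the paper actually combines hypotheses is not specified in this record.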
Pages: 423-442
Page count: 20
Related papers
27 in total
  • [1] Support vector machines for Segmental Minimum Bayes Risk decoding of continuous speech
    Venkataramani, V
    Chakrabartty, S
    Byrne, W
    [J]. ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 13 - 18
  • [2] Segmental minimum Bayes-Risk decoding for automatic speech recognition
    Goel, V
    Kumar, S
    Byrne, W
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (03): : 234 - 249
  • [3] Discriminative training for segmental Minimum Bayes Risk decoding
    Doumpiotis, V
    Tsakalidis, S
    Byrne, W
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 136 - 139
  • [4] Minimum Bayes risk estimation and decoding in large vocabulary continuous speech recognition
    Byrne, W
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (03): : 900 - 907
  • [5] Segmental minimum Bayes-risk decoding for automatic speech recognition (vol 12, pg 287, 2004)
    Goel, V
    Kumar, S
    Byrne, W
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (01): : 356 - 357
  • [6] Minimum Bayes-Risk decoding with presumed word significance for speech based information retrieval
    Shichiri, Takashi
    Nanjo, Hiroaki
    Yoshimi, Takehiko
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 1557 - 1560
  • [7] Landmark-Guided Segmental Speech Decoding for Continuous Mandarin Speech Recognition
    Chao, Hao
    Song, Cheng
    [J]. JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2016, 12 (03): : 410 - 421
  • [8] Minimum Bayes-risk decoding for statistical machine translation
    Kumar, S
    Byrne, W
    [J]. HLT-NAACL 2004: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2004, : 169 - 176
  • [9] Minimum Bayes error feature selection for continuous speech recognition
    Saon, G
    Padmanabhan, M
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 13, 2001, 13 : 800 - 806
  • [10] Lattice segmentation and minimum Bayes risk discriminative training for large vocabulary continuous speech recognition
    Doumpiotis, V
    Byrne, W
    [J]. SPEECH COMMUNICATION, 2006, 48 (02) : 142 - 160