Recent improvements of the RWTH large vocabulary speech recognition system on spontaneous speech

Authors
Sixtus, A [1]
Molau, S [1]
Kanthak, S [1]
Schlüter, R [1]
Ney, H [1]
Affiliation
[1] RWTH Aachen Univ Technol, Lehrstuhl Informat VI, D-52056 Aachen, Germany
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents recent improvements of the RWTH large vocabulary continuous speech recognition system (LVCSR). In particular, we will report on the integration of across-word models into the first recognition pass, and describe better algorithms for fast vocal tract normalization (VTN). We will focus both on the improvements in word error rate and how to speed up the recognizer with only minimal loss in recognition accuracy. Implementation details and experimental results are given for the VerbMobil task, a German spontaneous speech corpus. The 25.0% word error rate (WER) of our within-word baseline system was reduced to 21.4% with VTN and across-word models. Decreasing the real-time factor (RTF) by up to 85% resulted in only a small degradation in recognition performance of 2% relative on average.
Pages: 1671-1674
Number of pages: 4
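
Note on the figures in the abstract
The abstract mixes absolute WER values with a relative degradation figure, so it may help to put the WER improvement into relative terms as well. The sketch below is only an illustration: the function name `relative_change` is an assumption, and the last line applies the quoted "2% relative" figure to the 21.4% result purely as a hypothetical example, since the abstract reports that figure as an average and gives no per-setting values.

```python
def relative_change(baseline: float, new: float) -> float:
    """Return the change from baseline to new as a fraction of the baseline."""
    return (new - baseline) / baseline


# WER values quoted in the abstract (in percent).
baseline_wer = 25.0  # within-word baseline system
improved_wer = 21.4  # with VTN and across-word models

absolute_reduction = baseline_wer - improved_wer
relative_reduction = -relative_change(baseline_wer, improved_wer)

print(f"absolute reduction: {absolute_reduction:.1f} points")  # 3.6 points
print(f"relative reduction: {relative_reduction:.1%}")         # 14.4%

# Hypothetical illustration only: what a 2% relative degradation would
# mean if it were applied to the 21.4% result.
illustrative_wer = improved_wer * 1.02
print(f"illustrative WER after speed-up: {illustrative_wer:.1f}%")  # 21.8%
```

In other words, the reported improvement is 3.6 percentage points absolute, or about 14.4% relative to the baseline.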