The 1998 HTK system for transcription of conversational telephone speech

Cited by: 26
Authors
Hain, T [1]
Woodland, PC [1]
Niesler, TR [1]
Whittaker, EWD [1]
Affiliation
[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England
DOI
10.1109/ICASSP.1999.758061
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes the 1998 HTK large vocabulary speech recognition system for conversational telephone speech as used in the NIST 1998 Hub5E evaluation. Front-end and language modelling experiments conducted using various training and test sets from both the Switchboard and Callhome English corpora are presented. Our complete system includes reduced bandwidth analysis, side-based cepstral feature normalisation, vocal tract length normalisation (VTLN), triphone and quinphone hidden Markov models (HMMs) built using speaker adaptive training (SAT), maximum likelihood linear regression (MLLR) speaker adaptation and a confidence score based system combination. A detailed description of the complete system together with experimental results for each stage of our multi-pass decoding scheme is presented. The word error rate obtained is almost 20% better than our 1997 system on the development set.
Pages: 57-60
Page count: 4
Related papers
50 in total
  • [1] 1998 HTK system for transcription of conversational telephone speech
    Hain, T.
    Woodland, P.C.
    Niesler, T.R.
    Whittaker, E.W.D.
    [J]. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 1999, 1 : 57 - 60
  • [2] Development of the 2003 CU-HTK Conversational Telephone Speech transcription system
    Evermann, G
    Chan, HY
    Gales, MJF
    Hain, T
    Liu, X
    Mrva, D
    Wang, L
    Woodland, P
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004: 249 - 252
  • [3] New features in the CU-HTK system for transcription of conversational telephone speech
    Hain, T
    Woodland, PC
    Evermann, G
    Povey, D
    [J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 57 - 60
  • [4] Automatic transcription of conversational telephone speech
    Hain, T
    Woodland, PC
    Evermann, G
    Gales, MJF
    Liu, XY
    Moore, GL
    Povey, D
    Wang, L
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (06): 1173 - 1185
  • [5] Development of the CUHTK 2004 Mandarin conversational telephone speech transcription system
    Gales, MJF
    Jia, B
    Liu, X
    Sim, KC
    Woodland, P
    Yu, K
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005: 841 - 844
  • [6] Conversational telephone speech recognition
    Gauvain, JL
    Lamel, L
    Schwenk, H
    Adda, G
    Chen, L
    Lefèvre, F
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003: 212 - 215
  • [7] Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system
    Matsoukas, Spyros
    Gauvain, Jean-Luc
    Adda, Gilles
    Colthurst, Thomas
    Kao, Chia-Lin
    Kimball, Owen
    Lamel, Lori
    Lefevre, Fabrice
    Ma, Jeff Z.
    Makhoul, John
    Nguyen, Long
    Prasad, Rohit
    Schwartz, Richard
    Schwenk, Holger
    Xiang, Bing
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (05): 1541 - 1556
  • [8] The IBM 2016 English Conversational Telephone Speech Recognition System
    Saon, George
    Sercu, Tom
    Rennie, Steven
    Kuo, Hong-Kwang J.
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016: 7 - 11
  • [9] The IBM 2015 English Conversational Telephone Speech Recognition System
    Saon, George
    Kuo, Hong-Kwang J.
    Rennie, Steven
    Picheny, Michael
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 3140 - 3144
  • [10] Improvements in recognition of conversational telephone speech
    Peskin, B
    Newman, M
    McAllaster, D
    Nagesha, V
    Richards, H
    Wegmann, S
    Hunt, M
    Gillick, L
    [J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999: 53 - 56