Improvements in recognition of conversational telephone speech

Cited by: 6
Authors
Peskin, B [1]
Newman, M [1]
McAllaster, D [1]
Nagesha, V [1]
Richards, H [1]
Wegmann, S [1]
Hunt, M [1]
Gillick, L [1]
Affiliations
[1] Dragon Syst Inc, Newton, MA 02460 USA
DOI
10.1109/ICASSP.1999.758060
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper describes recent changes in Dragon's speech recognition system which have markedly improved performance on conversational telephone speech. Key changes include: the conversion to modified PLP-based cepstra from mel-cepstra; the replacement of our usual IMELDA transformation by a new transform using "semi-tied covariance"; a new multi-pass adaptation protocol; probabilities on alternate pronunciations in the lexicon; the addition of word-boundary tags in our acoustic models; and the redistribution of model parameters to build fewer output distributions but with more mixture components per model.
Pages: 53-56
Number of pages: 4
Related papers
50 in total
  • [1] Conversational telephone speech recognition
    Gauvain, JL
    Lamel, L
    Schwenk, H
    Adda, G
    Chen, L
    Lefèvre, F
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 212 - 215
  • [2] Conversational telephone speech recognition for Lithuanian
    Lileikyte, Rasa
    Lamel, Lori
    Gauvain, Jean-Luc
    Gorin, Arseniy
    [J]. COMPUTER SPEECH AND LANGUAGE, 2018, 49 : 71 - 82
  • [3] Improving English Conversational Telephone Speech Recognition
    Medennikov, Ivan
    Prudnikov, Alexey
    Zatvornitskiy, Alexander
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2 - 6
  • [4] Progress on Mandarin conversational telephone speech recognition
    Hwang, MY
    Lei, X
    Ng, T
    Bulyko, I
    Ostendorf, M
    Stolcke, A
    Wang, W
    Zheng, J
    Gadde, VRR
    Graciarena, M
    Siu, MH
    Huang, Y
    [J]. 2004 International Symposium on Chinese Spoken Language Processing, Proceedings, 2004, : 1 - 4
  • [5] Noise-Robust speech recognition of Conversational Telephone Speech
    Chen, Gang
    Tolba, Hesham
    O'Shaughnessy, Douglas
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1101 - 1104
  • [6] Recognition of conversational telephone speech using the JANUS speech engine
    Zeppenfeld, T
    Finke, M
    Ries, K
    Westphal, M
    Waibel, A
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1815 - 1818
  • [7] Trapping conversational speech: Extending trap/tandem approaches to conversational telephone speech recognition
    Morgan, N
    Chen, BY
    Zhu, QF
    Stolcke, A
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 537 - 540
  • [8] English Conversational Telephone Speech Recognition by Humans and Machines
    Saon, George
    Kurata, Gakuto
    Sercu, Tom
    Audhkhasi, Kartik
    Thomas, Samuel
    Dimitriadis, Dimitrios
    Cui, Xiaodong
    Ramabhadran, Bhuvana
    Picheny, Michael
    Lim, Lynn-Li
    Roomi, Bergul
    Hall, Phil
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 132 - 136
  • [9] Generating and evaluating segmentations for automatic speech recognition of conversational telephone speech
    Tranter, SE
    Yu, K
    Evermann, G
    Woodland, PC
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 753 - 756
  • [10] The IBM 2016 English Conversational Telephone Speech Recognition System
    Saon, George
    Sercu, Tom
    Rennie, Steven
    Kuo, Hong-Kwang J.
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 7 - 11