An RNN-based channel classification for Mandarin speech recognition over GSM/PSTN transmission environments

Cited by: 0
Author
Hong, WT [1]
Affiliation
[1] Ind Technol Res Inst, Adv Technol Ctr, CCL, Hsinchu, Taiwan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper adopts an RNN (Recurrent Neural Network)-based channel classification technique to improve the robustness of speech recognition over GSM (Global System for Mobile Communication) and PSTN (Public Switched Telephone Network) transmission channels. The RNN-based channel classifier selects the most likely HMM from a set of pre-trained HMMs, each trained for one specific channel environment. To improve performance, a broad-class discrimination is incorporated into the classifier so that disturbed frames of the test speech are rejected. On recognition of abbreviated Taiwan stock names, the proposed technique reduces the average word error rate by about 24% compared with the conventional HMM-based scheme. Experimental results show that it is an efficient framework for enhancing robustness across different channel environments.
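The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how frame-level outputs are combined, so summing log posteriors over the retained frames is an assumption here, as is the fallback when every frame is rejected. The per-frame posteriors and the keep mask stand in for the outputs of the RNN classifier and the broad-class discrimination, and `select_channel`, `hmm_sets`, and the model names are hypothetical.

```python
import math

CHANNELS = ("GSM", "PSTN")


def select_channel(frame_posteriors, keep_mask, channels=CHANNELS):
    """Return the channel with the highest summed log posterior.

    frame_posteriors: one tuple of per-channel probabilities per frame
                      (stand-in for the RNN classifier output).
    keep_mask:        one bool per frame; False marks a frame rejected as
                      disturbed (stand-in for broad-class discrimination).
    """
    if not frame_posteriors:
        raise ValueError("no frames to classify")
    kept = [p for p, keep in zip(frame_posteriors, keep_mask) if keep]
    if not kept:
        # Assumption: if every frame is rejected, fall back to all frames.
        kept = list(frame_posteriors)
    # Assumption: combine frames by summing floored log posteriors.
    scores = [
        sum(math.log(max(p[i], 1e-10)) for p in kept)
        for i in range(len(channels))
    ]
    return channels[scores.index(max(scores))]


# Hypothetical pre-trained model sets, one per channel environment.
hmm_sets = {"GSM": "hmm_gsm", "PSTN": "hmm_pstn"}

posteriors = [(0.9, 0.1), (0.2, 0.8), (0.3, 0.7), (0.1, 0.9)]
keep = [False, True, True, True]  # first frame rejected as disturbed

channel = select_channel(posteriors, keep)
selected_hmm = hmm_sets[channel]
```

The recognizer would then decode the utterance with `selected_hmm` only, rather than with a single HMM set shared across both channels.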
Pages: 1033-1036
Page count: 4
Related papers
29 in total
  • [1] An overview of RNN-based Mandarin speech recognition approaches
    Liao, YF
    Hong, WT
    Wang, WJ
    Wang, YR
    Chen, SH
    [J]. JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 1999, 22 (05) : 535 - 547
  • [2] A modular RNN-based method for continuous Mandarin speech recognition
    Liao, YF
    Chen, SH
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (03) : 252 - 263
  • [3] An RNN-based preclassification method for fast continuous Mandarin speech recognition
    Chen, SH
    Liao, YF
    Chiang, SM
    Chang, SG
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (01) : 86 - 90
  • [4] An RNN-based prosodic information synthesizer for Mandarin text-to-speech
    Chen, SH
    Hwang, SH
    Wang, YR
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (03) : 226 - 239
  • [5] AN RNN-BASED SPECTRAL INFORMATION GENERATION FOR MANDARIN TEXT-TO-SPEECH
    EOOO/CCL, Industrial Technology Research Institute, Chutung, Hsinchu, Taiwan
    Unknown
    [J]. Eur. Conf. Speech Commun. Technol., EUROSPEECH, 1600, (549-552):
  • [6] RNN-based prosodic modeling for mandarin speech and its application to speech-to-text conversion
    Wang, WJ
    Liao, YF
    Chen, SH
    [J]. SPEECH COMMUNICATION, 2002, 36 (3-4) : 247 - 265
  • [7] An RNN-based noise estimation and likelihood compensation for noisy speech recognition
    Hong, WT
    Chen, SH
    [J]. NEURAL NETWORKS FOR SIGNAL PROCESSING VI, 1996, : 293 - 301
  • [8] A GMM-based telephone channel classification for Mandarin speech recognition
    Xu, W
    Peng, X
    Wang, BX
    [J]. 2004 7TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS 1-3, 2004, : 642 - 645
  • [9] On a Hybrid NN/HMM Speech Recognition System with a RNN-Based Language Model
    Soutner, Daniel
    Zelinka, Jan
    Mueller, Ludek
    [J]. SPEECH AND COMPUTER, 2014, 8773 : 315 - 321
  • [10] Multiple Feature Extraction for RNN-based Assamese Speech Recognition for Speech to Text Conversion Application
    Dutta, Krishna
    Sarma, Kandarpa Kumar
    [J]. PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, DEVICES AND INTELLIGENT SYSTEMS (CODLS), 2012, : 600 - 603