An RNN-based channel classification for Mandarin speech recognition over GSM/PSTN transmission environments

Cited by: 0

Author:

Hong, WT [1]

Affiliation:

[1] Ind Technol Res Inst, Adv Technol Ctr, CCL, Hsinchu, Taiwan

Source:

2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS | 2002

Keywords:

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper adopts an RNN (Recurrent Neural Network)-based channel classification technique to improve the robustness of speech recognition over GSM (Global System for Mobile Communications) and PSTN (Public Switched Telephone Network) transmission channels. The RNN-based channel classifier selects the most likely HMM from a set of pre-trained HMMs, each trained for a specific channel environment. To improve performance, a broad-class discrimination is incorporated into the classifier so that disturbed frames of the test speech are rejected. With the proposed technique, the average word error rate for recognizing abbreviated Taiwan stock names drops by about 24% compared with the conventional HMM-based scheme. The experimental results show that it is an efficient framework for enhancing robustness across different channel environments.

Citation

Pages: 1033 - 1036

Page count: 4

29 items in total

[1] An overview of RNN-based Mandarin speech recognition approaches
Liao, YF
Hong, WT
Wang, WJ
Wang, YR
Chen, SH
[J]. JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 1999, 22 (05) : 535 - 547
[2] A modular RNN-based method for continuous Mandarin speech recognition
Liao, YF
Chen, SH
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (03) : 252 - 263
[3] An RNN-based preclassification method for fast continuous Mandarin speech recognition
Chen, SH
Liao, YF
Chiang, SM
Chang, SG
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (01) : 86 - 90
[4] An RNN-based prosodic information synthesizer for Mandarin text-to-speech
Chen, SH
Hwang, SH
Wang, YR
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (03) : 226 - 239
[5] An RNN-based spectral information generation for Mandarin text-to-speech
EOOO/CCL, Industrial Technology Research Institute, Chutung, Hsinchu, Taiwan
Unknown
[J]. Eur. Conf. Speech Commun. Technol., EUROSPEECH, 1600, (549-552):
[6] RNN-based prosodic modeling for mandarin speech and its application to speech-to-text conversion
Wang, WJ
Liao, YF
Chen, SH
[J]. SPEECH COMMUNICATION, 2002, 36 (3-4) : 247 - 265
[7] An RNN-based noise estimation and likelihood compensation for noisy speech recognition
Hong, WT
Chen, SH
[J]. NEURAL NETWORKS FOR SIGNAL PROCESSING VI, 1996 : 293 - 301
[8] A GMM-based telephone channel classification for Mandarin speech recognition
Xu, W
Peng, X
Wang, BX
[J]. 2004 7TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS 1-3, 2004 : 642 - 645
[9] On a Hybrid NN/HMM Speech Recognition System with a RNN-Based Language Model
Soutner, Daniel
Zelinka, Jan
Mueller, Ludek
[J]. SPEECH AND COMPUTER, 2014, 8773 : 315 - 321
[10] Multiple Feature Extraction for RNN-based Assamese Speech Recognition for Speech to Text Conversion Application
Dutta, Krishna
Sarma, Kandarpa Kumar
[J]. PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, DEVICES AND INTELLIGENT SYSTEMS (CODLS), 2012 : 600 - 603
