An overview of RNN-based Mandarin speech recognition approaches

Cited by: 3

Authors:

Liao, YF [1]

Hong, WT [1]

Wang, WJ [1]

Wang, YR [1]

Chen, SH [1]

Affiliation:

[1] Natl Tsing Hua Univ, Dept Commun Engn, Hsinchu 300, Taiwan

Source:

JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS | 1999, Vol. 22, No. 5

Keywords:

RNN; Mandarin speech recognition; HMM

DOI:

10.1080/02533839.1999.9670492

Chinese Library Classification:

T [Industrial Technology]

Discipline classification code:

08

Abstract:

The ANN-based approach is an alternative to the HMM method, which is currently the dominant technology in speech recognition. This paper gives an overview of RNN-based approaches to Mandarin speech recognition. Several previously proposed RNN-based approaches are discussed: syllable-boundary pre-segmentation, broad-class pre-classification, prosodic phrase boundary detection, and isolated and continuous syllable recognition. The survey finds that these approaches are all effective, so RNN technology is comparable to the conventional HMM method for Mandarin speech recognition.


Pages: 535-547

Page count: 13
