An overview of RNN-based Mandarin speech recognition approaches

Cited by: 3
Authors
Liao, YF [1]
Hong, WT [1]
Wang, WJ [1]
Wang, YR [1]
Chen, SH [1]
Affiliation
[1] Natl Tsing Hua Univ, Dept Commun Engn, Hsinchu 300, Taiwan
Keywords
RNN; Mandarin speech recognition; HMM
DOI
10.1080/02533839.1999.9670492
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
The ANN-based approach is an alternative to the HMM method, which is currently the dominant technology in speech recognition. This paper gives an overview of RNN-based approaches to Mandarin speech recognition. Several previously proposed RNN-based approaches are discussed: syllable-boundary pre-segmentation, broad-class pre-classification, prosodic phrase boundary detection, and isolated and continuous syllable recognition. The survey finds that these approaches are all effective, and concludes that RNN technology is comparable to the conventional HMM method for Mandarin speech recognition.
Pages: 535-547
Number of pages: 13
Related papers
50 in total
  • [1] A modular RNN-based method for continuous Mandarin speech recognition
    Liao, YF
    Chen, SH
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (03): : 252 - 263
  • [2] An RNN-based preclassification method for fast continuous Mandarin speech recognition
    Chen, SH
    Liao, YF
    Chiang, SM
    Chang, SG
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (01): : 86 - 90
  • [3] An RNN-based channel classification for Mandarin speech recognition over GSM/PSTN transmission environments
    Hong, WT
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 1033 - 1036
  • [4] An RNN-based prosodic information synthesizer for Mandarin text-to-speech
    Chen, SH
    Hwang, SH
    Wang, YR
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1998, 6 (03): : 226 - 239
  • [5] An RNN-based spectral information generation for Mandarin text-to-speech
    EOOO/CCL, Industrial Technology Research Institute, Chutung, Hsinchu, Taiwan
    Unknown
    [J]. Eur. Conf. Speech Commun. Technol., EUROSPEECH, 1600, (549-552):
  • [6] RNN-based prosodic modeling for Mandarin speech and its application to speech-to-text conversion
    Wang, WJ
    Liao, YF
    Chen, SH
    [J]. SPEECH COMMUNICATION, 2002, 36 (3-4) : 247 - 265
  • [7] An RNN-based noise estimation and likelihood compensation for noisy speech recognition
    Hong, WT
    Chen, SH
    [J]. NEURAL NETWORKS FOR SIGNAL PROCESSING VI, 1996, : 293 - 301
  • [8] On a Hybrid NN/HMM Speech Recognition System with a RNN-Based Language Model
    Soutner, Daniel
    Zelinka, Jan
    Mueller, Ludek
    [J]. SPEECH AND COMPUTER, 2014, 8773 : 315 - 321
  • [9] Optimization of RNN-Based Speech Activity Detection
    Gelly, Gregory
    Gauvain, Jean-Luc
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (03) : 646 - 656
  • [10] Multiple Feature Extraction for RNN-based Assamese Speech Recognition for Speech to Text Conversion Application
    Dutta, Krishna
    Sarma, Kandarpa Kumar
    [J]. PROCEEDINGS OF THE 2012 INTERNATIONAL CONFERENCE ON COMMUNICATIONS, DEVICES AND INTELLIGENT SYSTEMS (CODLS), 2012, : 600 - 603