Noise-robust automatic speech recognition using a discriminative echo state network

被引:7
|
作者
Skowronski, Mark D. [1 ]
Harris, John G. [1 ]
机构
[1] Univ Florida, Comp NeuroEngn Lab Elect & Comp Engn, Gainesville, FL 32611 USA
关键词
D O I
10.1109/ISCAS.2007.378015
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The echo state network (ESN) is a recurrent neural network proposed by Herbert Jaeger with a simplified training routine. Previously, we have demonstrated the noise-robust performance of the predictive ESN classifier in automatic speech recognition experiments in which the network was trained to predict the next frame of speech features. Classification performance was limited because the predictive models lacked discriminability, so we changed the model output to a one-of-many output encoding scheme and trained the model discriminatively. Performance was compared to a hidden Markov model (HMM) in small-vocabulary ASR experiments with additive noise. Accuracy of 50% was achieved by a discriminative ESN classifier at -8.5 dB SNR, compared to 0.4 dB SNR for a predictive ESN classifier and 6.6 dB SNR for an HMM. With discriminative training, a larger reservoir was employed for the discriminative ESN classifier compared to the predictive ESN classifier which resulted a larger memory depth and more noise-robust performance.
引用
收藏
页码:1771 / 1774
页数:4
相关论文
共 50 条
  • [1] Noise-robust automatic speech recognition using a predictive echo state network
    Skowronski, Mark D.
    Harris, John G.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (05): : 1724 - 1730
  • [2] An overview of noise-robust automatic speech recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    [J]. IEEE Transactions on Audio, Speech and Language Processing, 2014, 22 (04): : 745 - 777
  • [3] An Overview of Noise-Robust Automatic Speech Recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (04) : 745 - 777
  • [4] Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition
    Mahkonen, Katariina
    Hurmalainen, Antti
    Virtanen, Tuomas
    Gemmeke, Jort
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 472 - +
  • [5] Factorial Speech Processing Models for Noise-Robust Automatic Speech Recognition
    Khademian, Mahdi
    Homayounpour, Mohammad Mehdi
    [J]. 2015 23RD IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2015, : 637 - 642
  • [6] Noise-robust speech recognition by discriminative adaptation in parallel model combination
    Chung, YJ
    [J]. ELECTRONICS LETTERS, 2000, 36 (04) : 370 - 371
  • [7] INCORPORATING MASK MODELLING FOR NOISE-ROBUST AUTOMATIC SPEECH RECOGNITION
    Koekueer, Muenevver
    Jancovic, Peter
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3929 - 3932
  • [8] Empirical Mode Decomposition For Noise-Robust Automatic Speech Recognition
    Wu, Kuo-Hao
    Chen, Chia-Ping
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2074 - 2077
  • [9] A companding front end for noise-robust automatic speech recognition
    Guinness, J
    Raj, B
    Schmidt-Nielsen, B
    Turicchia, L
    Sarpeshkar, R
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 249 - 252
  • [10] Noise-Robust Algorithm of Speech Features Extraction for Automatic Speech Recognition System
    Yakhnev, A. N.
    Pisarev, A. S.
    [J]. PROCEEDINGS OF THE XIX IEEE INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND MEASUREMENTS (SCM 2016), 2016, : 206 - 208