EXPLORING DEEP NEURAL NETWORKS AND DEEP AUTOENCODERS IN REVERBERANT SPEECH RECOGNITION

Authors
Mimura, Masato [1]
Sakai, Shinsuke [1]
Kawahara, Tatsuya [1]
Affiliations
[1] Kyoto Univ, Acad Ctr Comp & Media Studies, Sakyo Ku, Kyoto 6068501, Japan
Keywords
reverberant speech recognition; Deep Neural Networks (DNN); Deep Autoencoder (DAE); ALGORITHM;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We propose an approach to reverberant speech recognition that adopts deep learning at both the front end and the back end of the system. At the front end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, and speech recognition is performed using DNN-HMM acoustic models at the back end. The system was evaluated on simulated and real reverberant speech data sets. On average, the DNN-HMM system trained on the multi-condition training data outperformed the MLLR-adapted GMM-HMM system trained on the same data. The feature enhancement with the DAE contributed to the improvement of recognition accuracy, especially in more adverse conditions. We also performed an unsupervised adaptation of the DNN-HMM models to the test data enhanced by the DAE and achieved improvements in word accuracy in all reverberation conditions of the test data.
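The front-end idea in the abstract, a DAE that maps reverberant feature frames to enhanced ones, can be illustrated with a minimal NumPy sketch. The abstract gives no architectural details, so everything concrete here is an assumption made for illustration: the 40-dimensional features, the context of 5 frames on each side, the two 256-unit sigmoid hidden layers, the linear output layer, and the helper names `splice_frames`, `init_dae`, and `dae_enhance`. The delayed-copy "reverberation" is only a toy stand-in for real reverberant data, and the weights are random rather than trained.

```python
import numpy as np

def splice_frames(feats, context):
    """Stack each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first/last frame.
    """
    padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
    num_frames = feats.shape[0]
    return np.hstack([padded[i:i + num_frames] for i in range(2 * context + 1)])

def init_dae(layer_sizes, seed=0):
    """Random (untrained) weights; layer sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def dae_enhance(params, spliced):
    """Forward pass: sigmoid hidden layers, linear output (assumed design)."""
    h = spliced
    for w, b in params[:-1]:
        h = 1.0 / (1.0 + np.exp(-(h @ w + b)))
    w, b = params[-1]
    return h @ w + b

def mse(pred, target):
    """Mean squared error, a common regression objective for such a DAE."""
    return float(np.mean((pred - target) ** 2))

# Toy data: 100 frames of 40-dimensional "clean" features.
rng = np.random.default_rng(1)
clean = rng.normal(size=(100, 40))
# Toy stand-in for reverberation: add a delayed, attenuated copy.
reverberant = clean + 0.5 * np.roll(clean, 3, axis=0)

context = 5
spliced = splice_frames(reverberant, context)        # shape (100, 440)
params = init_dae([spliced.shape[1], 256, 256, 40])
enhanced = dae_enhance(params, spliced)              # shape (100, 40)
loss = mse(enhanced, clean)  # the quantity training would minimise
```

In a full system along the lines the abstract describes, the enhanced frames would then be passed to the DNN-HMM back end in place of the raw reverberant features; training the DAE weights against clean targets is omitted from this sketch.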
Pages: 197-201
Page count: 5