Binaural Deep Neural Network for Noise Robust Automatic Speech Recognition

被引:0
|
作者
Jiang, Yi [1 ]
Zu, Yuan-Yuan [1 ]
机构
[1] Quartermaster Equipment Res Inst, Beijing, Peoples R China
关键词
Deep Neural Network (DNN); Computational Auditory Scene Analysis (CASA); Automatic Speech Recognition (ASR); Ideal Parameter Mask;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Robust automatic speech recognition (ASR) is a challenge task, especially in noisy environments. The difference between the clean training speech model and the noisy speech model is a main factor to reduce the performance of ASR systems. The goal of a robust ASR system is getting the target speech energy distribution, which provides the discriminate information for the acoustic model. We use a binaural deep neural network (DNN) to estimate the energy of the target speech in the mixture through SNR estimation. Then the estimated target speech is used as the input of a convenient ASR system to improve the recognition accuracy. We use the ideal parameter mask as the DNN training goal, and cross entropy as the training cost function. Experiments show the robust ASR performance of the proposed algorithm with various signal to noise ratio conditions.
引用
收藏
页码:512 / 517
页数:6
相关论文
共 50 条
  • [1] A Binaural Deep Neural Networks Parameter Mask for the Robust Automatic Speech Recognition System
    Jiang, Yi
    Liu, Runsheng
    [J]. 2016 INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC), 2016, : 352 - 356
  • [2] Binaural Deep Neural Network for Robust Speech Enhancement
    Jiang, Yi
    Liu, Runsheng
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC), 2014, : 692 - 695
  • [3] Incorporating a Generative Front-end Layer to Deep Neural Network for Noise Robust Automatic Speech Recognition
    Kundu, Souvik
    Sim, Khe Chai
    Gales, Mark
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2359 - 2363
  • [4] Employing Robust Principal Component Analysis for Noise-Robust Speech Feature Extraction in Automatic Speech Recognition with the Structure of a Deep Neural Network
    Hung, Jeih-weih
    Lin, Jung-Shan
    Wu, Po-Jen
    [J]. APPLIED SYSTEM INNOVATION, 2018, 1 (03) : 1 - 14
  • [5] EXPLOITING SYNCHRONY SPECTRA AND DEEP NEURAL NETWORKS FOR NOISE-ROBUST AUTOMATIC SPEECH RECOGNITION
    Ma, Ning
    Marxer, Ricard
    Barker, Jon
    Brown, Guy J.
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 490 - 495
  • [6] AN INVESTIGATION OF DEEP NEURAL NETWORKS FOR NOISE ROBUST SPEECH RECOGNITION
    Seltzer, Michael L.
    Yu, Dong
    Wang, Yongqiang
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7398 - 7402
  • [7] Deep Neural Network Based Speech Separation for Robust Speech Recognition
    Tu Yanhui
    Jun, Du
    Xu Yong
    Dai Lirong
    Chin-Hui, Lee
    [J]. 2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 532 - 536
  • [8] JOINT ACOUSTIC FACTOR LEARNING FOR ROBUST DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION
    Kundu, Souvik
    Mantena, Gautam
    Qian, Yanmin
    Tan, Tian
    Delcroix, Marc
    Sim, Khe Chai
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5025 - 5029
  • [9] TOWARDS STRUCTURED DEEP NEURAL NETWORK FOR AUTOMATIC SPEECH RECOGNITION
    Liao, Yi-Hsiu
    Lee, Hung-yi
    Lee, Lin-shan
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 137 - 144
  • [10] Factored deep convolutional neural networks for noise robust speech recognition
    Fujimoto, Masakiyo
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3837 - 3841