Binaural Deep Neural Network for Noise Robust Automatic Speech Recognition

Cited by: 0

Authors:

Jiang, Yi [1]

Zu, Yuan-Yuan [1]

Affiliations:

[1] Quartermaster Equipment Res Inst, Beijing, Peoples R China

Source:

INTERNATIONAL CONFERENCE ON CONTROL ENGINEERING AND AUTOMATION (ICCEA 2014) | 2014

Keywords:

Deep Neural Network (DNN); Computational Auditory Scene Analysis (CASA); Automatic Speech Recognition (ASR); Ideal Parameter Mask;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Robust automatic speech recognition (ASR) is a challenging task, especially in noisy environments. The mismatch between the acoustic model trained on clean speech and the noisy speech to be recognized is a main factor that degrades the performance of ASR systems. The goal of a robust ASR system is to recover the energy distribution of the target speech, which provides the discriminative information for the acoustic model. We use a binaural deep neural network (DNN) to estimate the energy of the target speech in the mixture through signal-to-noise ratio (SNR) estimation. The estimated target speech is then used as the input to a conventional ASR system to improve recognition accuracy. We use the ideal parameter mask as the DNN training target and cross entropy as the training cost function. Experiments show that the proposed algorithm gives robust ASR performance under various SNR conditions.
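
The abstract names three ingredients: an SNR estimate per time-frequency unit, a mask used as the training target, and a cross-entropy cost. The sketch below illustrates how such pieces could fit together. It is not the authors' implementation: the abstract does not define the ideal parameter mask, so a soft ratio mask S / (S + N) is used here as a stand-in, and the function names (`ratio_mask_from_snr`, `cross_entropy`, `estimate_target_energy`) and the toy values are hypothetical.

```python
import math

def ratio_mask_from_snr(snr_db):
    """Map an SNR in dB to a soft mask in (0, 1), i.e. S / (S + N).

    Assumption: this ratio mask stands in for the paper's
    'ideal parameter mask', whose exact form the abstract does not give.
    """
    snr_linear = 10.0 ** (snr_db / 10.0)
    return snr_linear / (1.0 + snr_linear)

def cross_entropy(target_mask, predicted_mask, eps=1e-7):
    """Mean binary cross entropy between target and predicted masks."""
    total = 0.0
    for t, p in zip(target_mask, predicted_mask):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(target_mask)

def estimate_target_energy(mixture_energy, mask):
    """Scale each unit's mixture energy by its mask value."""
    return [e * m for e, m in zip(mixture_energy, mask)]

# Toy example: four time-frequency units with made-up SNRs and energies.
snr_db = [-10.0, 0.0, 10.0, 20.0]
mixture_energy = [1.0, 2.0, 4.0, 8.0]

target = [ratio_mask_from_snr(s) for s in snr_db]
uninformed = [0.5] * len(target)  # a predictor that knows nothing

loss_uninformed = cross_entropy(target, uninformed)
loss_perfect = cross_entropy(target, target)
target_energy = estimate_target_energy(mixture_energy, target)
```

In this toy setting the cost is lowest when the predicted mask equals the target mask, which is the property that makes cross entropy usable as a training cost for a mask-estimating network. The masked energies would then take the place of the noisy features passed to the recognizer.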


Pages: 512-517

Page count: 6

