Robust automatic speech recognition based on neural network in reverberant environments

Citations: 0
Authors
Bai, L. [1]
Li, H. L. [1]
He, Y. Y. [1]
Affiliations
[1] Coordinat Ctr China, Natl Comp Network Emergency Response Tech Team, Beijing, Peoples R China
Keywords
BOTTLE-NECK FEATURES
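DOI
Not available
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract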
Reverberant environments remain a major challenge for speech recognition. This paper presents a method for reverberant Automatic Speech Recognition (ASR) that combines front-end processing with enhanced Voice Activity Detection (VAD). A 2-channel dereverberation method is adopted to achieve robust dereverberation under different reverberant conditions. A 2-channel spectral enhancement method is also used, in which the gain of each frequency bin is controlled by the acoustic scene, detected by analyzing the full-band coherence property. A Deep Neural Network (DNN) serves as a feature extractor, and a DNN-based VAD further improves ASR performance. The DNN-based front-end allows very flexible integration of meta-information. Bottleneck features are extracted in place of the MFCC features used in the HMM-GMM system. Finally, we evaluate our methods on the data provided by the REVERB challenge. On simulated data, the system achieves a relative reduction in Word Error Rate (WER) of more than 33%.
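Note on the metric: the abstract reports a relative, not absolute, reduction in WER. The sketch below shows how the two quantities are commonly computed, assuming the standard word-level edit-distance definition of WER. The function names and the example numbers are illustrative assumptions and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed
    # reference prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to illustrate the arithmetic.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")
    print(f"Relative reduction: {relative_reduction(0.30, 0.20):.1%}")
```

Under this definition, a drop from a hypothetical baseline WER of 30% to 20% is a 10-point absolute reduction but a relative reduction of about 33%.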
Pages: 1319-1324
Page count: 6
Related papers
50 items in total
  • [21] Robust Automatic Speech Recognition for Accented Mandarin in Car Environments
    Pei Ding
    Lei He
    Xiang Yan
    Jie Hao
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2542 - 2545
  • [22] A robust feature extraction for automatic speech recognition in noisy environments
    Lima, C
    Almeida, LB
    Monteiro, JL
    [J]. 2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 540 - 543
  • [23] THE AUTOMATIC SPEECH RECOGNITION IN REVERBERANT ENVIRONMENTS (ASpIRE) CHALLENGE
    Harper, Mary
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 547 - 554
  • [24] A hybrid neural network based speech recognition system for pervasive environments
    Sehgal, MSB
    Gondal, I
    Dooley, L
    [J]. INMIC 2004: 8th International Multitopic Conference, Proceedings, 2004, : 309 - 314
  • [25] A STUDY ON DATA AUGMENTATION OF REVERBERANT SPEECH FOR ROBUST SPEECH RECOGNITION
    Ko, Tom
    Peddinti, Vijayaditya
    Povey, Daniel
    Seltzer, Michael L.
    Khudanpur, Sanjeev
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5220 - 5224
  • [26] EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION
    Baby, Deepak
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Van hamme, Hugo
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4485 - 4489
  • [27] A robust endpoint detection of speech for noisy environments with application to automatic speech recognition
    Bou-Ghazale, SE
    Assaleh, K
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 3808 - 3811
  • [28] Noise-Robust Speech Recognition Based on RBF Neural Network
    Hou, Xuemei
    [J]. HIGH PERFORMANCE STRUCTURES AND MATERIALS ENGINEERING, PTS 1 AND 2, 2011, 217-218 : 413 - 418
  • [29] LOCAL TRAJECTORY BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION WITH DEEP NEURAL NETWORK
    You, Yongbin
    Qian, Yanmin
    Yu, Kai
    [J]. 2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 5 - 9
  • [30] Acoustic diversity for improved speech recognition in reverberant environments
    Gillespie, BW
    Atlas, LE
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 557 - 560