ROBUST SPEECH RECOGNITION FROM RATIO MASKS

Authors
Wang, Zhong-Qiu [1]
Wang, DeLiang [1, 2]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Keywords
Robust ASR; Ideal Ratio Mask; Ideal Binary Mask; CNN; DNN; NOISE
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Robustness against noise is crucial for automatic speech recognition (ASR) systems in real-world environments. In this paper, we propose a novel approach that performs robust ASR by directly recognizing ratio masks. In the proposed approach, a deep neural network (DNN) is first trained to estimate the ideal ratio mask (IRM) from a noisy utterance, and a convolutional neural network (CNN) is then employed to recognize the estimated IRMs. The proposed approach has been evaluated on the TIDigits corpus, and the results demonstrate that direct recognition of ratio masks outperforms both direct recognition of binary masks and a traditional MMSE-HMM-based method for robust ASR.
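The abstract does not state how the masks are defined. As a rough illustration only, the sketch below uses definitions that are common in the speech separation literature: the ratio mask of a time-frequency unit is taken as (S / (S + N)) ** beta, where S and N are the speech and noise energies, and the binary mask is 1 where the local SNR exceeds a threshold. The function names, the `beta` and `lc_db` parameters, and the toy energies are assumptions for this example, not details from the paper.

```python
import math


def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """Ratio mask per time-frequency unit: (S / (S + N)) ** beta, in [0, 1].

    beta=0.5 is an assumed, commonly used exponent; the abstract gives none.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            total = s + n
            # An empty unit (no speech, no noise) is assigned 0 by convention.
            row.append((s / total) ** beta if total > 0 else 0.0)
        mask.append(row)
    return mask


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Binary mask per unit: 1 if the local SNR in dB exceeds lc_db, else 0."""
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0:
                row.append(0)  # no speech energy in this unit
            elif n <= 0:
                row.append(1)  # speech with no noise: infinite SNR
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Toy 2x2 energy grids (rows are frames, columns are frequency channels).
speech = [[4.0, 1.0], [0.0, 9.0]]
noise = [[4.0, 3.0], [1.0, 0.0]]

irm = ideal_ratio_mask(speech, noise)
ibm = ideal_binary_mask(speech, noise)
```

In this toy grid the ratio mask keeps graded values for units where speech and noise are mixed, while the binary mask collapses them to 0 or 1. That extra soft information is the kind of difference the comparison in the abstract is concerned with.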
Pages: 5720-5724
Number of pages: 5