ROBUST SPEECH RECOGNITION FROM RATIO MASKS

Authors
Wang, Zhong-Qiu [1]
Wang, DeLiang [1, 2]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[2] Ohio State Univ, Ctr Cognit & Brain Sci, Columbus, OH 43210 USA
Keywords
Robust ASR; Ideal Ratio Mask; Ideal Binary Mask; CNN; DNN; NOISE;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Robustness against noise is crucial for automatic speech recognition (ASR) systems in real-world environments. In this paper, we propose a novel approach that performs robust ASR by directly recognizing ratio masks. In the proposed approach, a deep neural network (DNN) is first trained to estimate the ideal ratio mask (IRM) from a noisy utterance, and then a convolutional neural network (CNN) is employed to recognize the estimated IRMs. The proposed approach has been evaluated on the TIDigits corpus, and the results demonstrate that direct recognition of ratio masks outperforms both direct recognition of binary masks and a traditional MMSE-HMM-based method for robust ASR.
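The abstract does not define the two mask types it compares, so the following is only a minimal sketch of one commonly used formulation, not the authors' implementation. It assumes the IRM is the per-unit ratio of speech energy to speech-plus-noise energy raised to an exponent `beta` (0.5 here), and that the ideal binary mask (IBM) marks units whose local SNR exceeds a threshold `lc_db`. The function names, parameters, and toy values are hypothetical.

```python
import numpy as np


def ideal_ratio_mask(speech_power, noise_power, beta=0.5, eps=1e-12):
    """Soft mask in [0, 1] for each time-frequency unit (assumed definition)."""
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # eps avoids division by zero when a unit contains no energy at all.
    return (speech_power / (speech_power + noise_power + eps)) ** beta


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0, eps=1e-12):
    """Hard 0/1 mask: 1 where the local SNR in dB exceeds lc_db (assumed definition)."""
    speech_power = np.asarray(speech_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(float)


# Toy power values for 2 frames x 3 frequency channels (illustrative only).
speech = np.array([[4.0, 1.0, 0.0],
                   [9.0, 1.0, 1.0]])
noise = np.array([[0.0, 1.0, 4.0],
                  [1.0, 3.0, 1.0]])

irm = ideal_ratio_mask(speech, noise)
ibm = ideal_binary_mask(speech, noise)
```

Under these assumptions the ratio mask keeps graded information in units where speech and noise are comparable, while the binary mask collapses each unit to 0 or 1. In the pipeline the abstract describes, a DNN would estimate such a ratio mask from the noisy utterance, since the clean speech and noise are not available separately at test time.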
Pages: 5720 - 5724
Number of pages: 5