NOISE ROBUST SPEECH RECOGNITION ON AURORA4 BY HUMANS AND MACHINES

Authors
Qian, Yanmin [1 ,2 ]
Tan, Tian [1 ]
Hu, Hu [1 ]
Liu, Qi [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Tencent, Tencent AI Lab, Bellevue, WA 98004 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
robust speech recognition; very deep convolution residual network; cluster adaptive training; future-vector;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Although great progress has been made in automatic speech recognition (ASR), significant performance degradation still exists in noisy environments. Building on our previously introduced very deep CNNs, this paper integrates residual learning to form a very deep convolutional residual network (VDCRN) and evaluates it in noisy conditions, where it shows stronger robustness. Cluster adaptive training (CAT) is then developed on the VDCRN to reduce the mismatch between training and testing in noisy scenarios. In addition, an advanced future-vector assisted LSTM-RNN LM is proposed to achieve a further gain. All the proposed approaches are evaluated on Aurora4, and each yields a significant improvement. The final system achieves 3.09% WER on Aurora4, which approaches human performance on this task. This is a new milestone for noise-robust ASR on this benchmark.
Pages: 5604-5608
Page count: 5
Related papers
50 in total
  • [21] Assessing Costa Rican children speech recognition by humans and machines
    Morales-Rodriguez, Maribel
    Coto-Jimenez, Marvin
    TECNOLOGIA EN MARCHA, 2022, 35
  • [22] SYNTHESIS AND RECOGNITION OF SPEECH - VOICE COMMUNICATION BETWEEN HUMANS AND MACHINES
    FLANAGAN, JL
    IEEE TRANSACTIONS ON SONICS AND ULTRASONICS, 1982, 29 (03): 158 - 158
  • [23] A Study of Additive Noise Model for Robust Speech Recognition
    Awatade, Manisha H.
    2ND INTERNATIONAL CONFERENCE ON METHODS AND MODELS IN SCIENCE AND TECHNOLOGY (ICM2ST-11), 2011, 1414
  • [24] Noise robust automatic speech recognition: review and analysis
    Dua M.
    Akanksha
    Dua S.
    International Journal of Speech Technology, 2023, 26 (02) : 475 - 519
  • [25] Extended VTS for Noise-Robust Speech Recognition
    van Dalen, Rogier C.
    Gales, Mark J. F.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): 733 - 743
  • [26] A novel channel estimate for noise robust speech recognition
    Vanderreydt, Geoffroy
    Demuynck, Kris
    COMPUTER SPEECH AND LANGUAGE, 2024, 86
  • [27] NOISE AWARE MANIFOLD LEARNING FOR ROBUST SPEECH RECOGNITION
    Tomar, Vikrant Singh
    Rose, Richard C.
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 7087 - 7091
  • [28] Noise robust speech recognition with state duration constraints
    Laurila, K
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997: 871 - 874
  • [29] Robust automatic speech recognition in the presence of impulsive noise
    Potamitis, I
    Fakotakis, N
    Kokkinakis, G
    ELECTRONICS LETTERS, 2001, 37 (12) : 799 - 800
  • [30] An overview of noise-robust automatic speech recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    IEEE Transactions on Audio, Speech and Language Processing, 2014, 22 (04): 745 - 777