Time-Reversal Enhancement Network With Cross-Domain Information for Noise-Robust Speech Recognition

Authors
Chao, Fu-An [1]
Hung, Jeih-Weih [3]
Sheu, Tommy [4]
Chen, Berlin [2]
Affiliations
[1] Natl Taiwan Normal Univ, Taipei 11677, Taiwan
[2] Natl Taiwan Normal Univ, Comp Sci & Informat Engn Dept, Taipei 11677, Taiwan
[3] Natl Chi Nan Univ, Dept Elect Engn, Puli 54516, Taiwan
[4] Delta Elect Inc, Delta Management Syst DMS Dept, Taipei 11491, Taiwan
Keywords
Feature extraction; Convolutional neural networks; Spectrogram; Noise measurement; Speech enhancement; Estimation; Time-domain analysis
DOI
10.1109/MMUL.2021.3139302
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Due to the enormous progress in deep learning, speech enhancement (SE) techniques have shown promising efficacy and play a pivotal role as a front end to automatic speech recognition (ASR) systems in mitigating the effects of noise. In this article, we put forward a novel cross-domain time-reversal enhancement network (CD-TENET). CD-TENET leverages the time-reversed version of a speech signal together with two effective features that capture the phase information of the signal in the time domain and the frequency domain, respectively, to promote SE performance for noise-robust ASR. Extensive experiments demonstrate that CD-TENET not only recovers the original speech effectively but also improves SE and ASR performance simultaneously. More surprisingly, CD-TENET offers a marked relative word error rate reduction on test utterances contaminated with unseen noises, compared with a strong baseline trained under the multicondition setting.
Pages: 114-124
Page count: 11