Deep Neural Network Based Speech Separation for Robust Speech Recognition

Citations: 0
Authors
Tu Yanhui [1]
Jun, Du [1]
Xu Yong [1]
Dai Lirong [1]
Chin-Hui, Lee [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
single-channel speech separation; robust speech recognition; deep neural networks; semi-supervised mode; ALGORITHM
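DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract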
In this paper, a novel deep neural network (DNN) architecture is proposed to generate the speech features of both the target speaker and the interferer for speech separation, without using any prior information about the interfering speaker. The DNN is adopted to directly model the highly nonlinear relationship between the speech features of the mixed signals and those of the two competing speakers. Experimental results on a monaural speech separation and recognition challenge task show that the proposed DNN framework improves separation performance on several objective measures under the semi-supervised mode, in which the training data of the target speaker is provided while the unseen interferer in the separation stage is predicted by using multiple interfering speakers mixed with the target speaker in the training stage. Furthermore, as a preprocessing step in the testing stage for robust speech recognition, our speech separation approach achieves significant improvements in recognition accuracy over the baseline system with no source separation.
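The dual-output idea in the abstract, one network that maps features of the mixed signal to feature estimates for both speakers, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single sigmoid hidden layer, the layer sizes, the function names (`init_layer`, `affine`, `separate`, `joint_mse`), and the joint mean-squared-error objective. The abstract does not state the feature type, network depth, or training criterion.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(weights, biases, x):
    """Compute W x + b for a single input vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def separate(mixed, hidden_layer, output_layer):
    """Map one mixed-speech feature vector to (target, interferer) estimates.

    The output layer is twice the feature dimension: the first half is read
    as the target speaker's features, the second half as the interferer's.
    """
    hidden = [1.0 / (1.0 + math.exp(-a)) for a in affine(*hidden_layer, mixed)]
    out = affine(*output_layer, hidden)
    dim = len(out) // 2
    return out[:dim], out[dim:]


def joint_mse(est_target, est_interferer, ref_target, ref_interferer):
    """Mean squared error over both output streams (assumed objective)."""
    est = est_target + est_interferer
    ref = ref_target + ref_interferer
    return sum((e - r) ** 2 for e, r in zip(est, ref)) / len(est)


# Hypothetical sizes, chosen only to keep the example small.
FEATURE_DIM = 4
HIDDEN_DIM = 8

rng = random.Random(0)
hidden_layer = init_layer(FEATURE_DIM, HIDDEN_DIM, rng)
output_layer = init_layer(HIDDEN_DIM, 2 * FEATURE_DIM, rng)

mixed = [0.5, -0.2, 0.1, 0.3]  # one frame of mixed-signal features (made up)
target_est, interferer_est = separate(mixed, hidden_layer, output_layer)
```

In this sketch the network is untrained, so the estimates are meaningless; the point is only the shape of the mapping. Under the semi-supervised mode described in the abstract, training pairs would combine the known target speaker with multiple different interferers, so that the interferer half of the output can generalize to a speaker not seen in training.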
Pages: 532-536
Page count: 5
Related papers
50 in total
  • [31] DEEP RECURRENT REGULARIZATION NEURAL NETWORK FOR SPEECH RECOGNITION
    Chien, Jen-Tzung
    Lu, Tsai-Wei
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4560 - 4564
  • [32] Speech Separation based on Deep Belief Network
    Wu Haijia
    Zhang Xiongwei
    Zhang Liangliang
    Zou Xia
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL INDUSTRIAL INFORMATICS AND COMPUTER ENGINEERING CONFERENCE, 2015, : 1486 - 1493
  • [33] VERY DEEP CONVOLUTIONAL NEURAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Qian, Yanmin
    Woodland, Philip C.
    [J]. 2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 481 - 488
  • [34] AN INVESTIGATION OF DEEP NEURAL NETWORKS FOR NOISE ROBUST SPEECH RECOGNITION
    Seltzer, Michael L.
    Yu, Dong
    Wang, Yongqiang
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7398 - 7402
  • [35] Deep Q-network-based noise suppression for robust speech recognition
    Park, Tae-Jun
    Chang, Joon-Hyuk
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (05) : 2362 - 2373
  • [36] Deep Q-network-based noise suppression for robust speech recognition
    Park T.-J.
    Chang J.-H.
    [J]. Turkish Journal of Electrical Engineering and Computer Sciences, 2021, 25 (09) : 2362 - 2373
  • [37] A Noise-Robust Speech Recognition System Based on Wavelet Neural Network
    Wang, Yiping
    Zhao, Zhefeng
    [J]. ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PT III, 2011, 7004 : 392 - 397
  • [38] Speech Enhancement Method Based On LSTM Neural Network for Speech Recognition
    Liu, Ming
    Wang, Yujun
    Wang, Jin
    Wang, Jing
    Xie, Xiang
    [J]. PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2018, : 245 - 249
  • [39] Neural Network Adaptive Beamforming for Robust Multichannel Speech Recognition
    Li, Bo
    Sainath, Tara N.
    Weiss, Ron J.
    Wilson, Kevin W.
    Bacchiani, Michiel
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1976 - 1980
  • [40] GEOMETRIC INFORMATION BASED MONAURAL SPEECH SEPARATION USING DEEP NEURAL NETWORK
    Xian, Yang
    Sun, Yang
    Chambers, Jonathon A.
    Naqvi, Syed Mohsen
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4454 - 4458