An Experimental Study on Speech Enhancement Based on Deep Neural Networks

Cited by: 656
Authors
Xu, Yong [1]
Du, Jun [1]
Dai, Li-Rong [1]
Lee, Chin-Hui [2]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei 230026, Anhui, Peoples R China
[2] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
Deep neural networks; noise reduction; regression model; speech enhancement
DOI
10.1109/LSP.2013.2291240
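Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract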
This letter presents a regression-based speech enhancement framework using deep neural networks (DNNs) with a multiple-layer deep architecture. In the DNN learning process, a large training set ensures a powerful modeling capability to estimate the complicated nonlinear mapping from observed noisy speech to the desired clean signals. Acoustic context was found to improve the continuity of the speech separated from the background noise, without the annoying musical artifacts commonly observed in conventional speech enhancement algorithms. A series of pilot experiments was conducted under multi-condition training with more than 100 hours of simulated speech data, resulting in good generalization capability even in mismatched testing conditions. When compared with the logarithmic minimum mean square error approach, the proposed DNN-based algorithm tends to achieve significant improvements in terms of various objective quality measures. Furthermore, in a subjective preference evaluation with 10 listeners, 76.35% of the subjects were found to prefer the DNN-based enhanced speech to that obtained with the conventional technique.
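The regression idea in the abstract (a network that maps noisy frames, together with their acoustic context, to clean frames) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the letter: the function names `splice_context`, `mlp_forward`, and `mse`, the context width, the sigmoid hidden units, the layer sizes, the random untrained weights, and the use of generic spectral feature frames as input.

```python
import numpy as np

def splice_context(frames, left=2, right=2):
    """Stack each frame with its neighbours along the feature axis.

    Edge frames are padded by repeating the first/last frame (an assumption).
    """
    padded = np.concatenate(
        [np.repeat(frames[:1], left, axis=0),
         frames,
         np.repeat(frames[-1:], right, axis=0)],
        axis=0,
    )
    n = frames.shape[0]
    # Column block i holds the frame at offset (i - left) from the centre frame.
    return np.concatenate(
        [padded[i:i + n] for i in range(left + right + 1)], axis=1
    )

def mlp_forward(x, weights, biases):
    """Feed-forward regression net: sigmoid hidden layers, linear output."""
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = 1.0 / (1.0 + np.exp(-(h @ w + b)))
    return h @ weights[-1] + biases[-1]

def mse(pred, target):
    """Mean squared error between estimated and clean feature frames."""
    return float(np.mean((pred - target) ** 2))

# Toy data standing in for noisy/clean spectral features (hypothetical sizes).
rng = np.random.default_rng(0)
clean = rng.standard_normal((6, 4))            # 6 frames, 4 feature bins
noisy = clean + 0.1 * rng.standard_normal((6, 4))

x = splice_context(noisy, left=2, right=2)     # shape (6, 20)
sizes = [x.shape[1], 16, clean.shape[1]]
weights = [0.1 * rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

estimate = mlp_forward(x, weights, biases)     # shape (6, 4)
loss = mse(estimate, clean)                    # objective a trainer would minimise
```

In an actual system the weights would be trained by minimizing this loss over a large multi-condition set; the sketch only shows how context splicing and the regression objective fit together.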
Pages: 65-68
Number of pages: 4
Related papers
50 in total
  • [1] A Regression Approach to Speech Enhancement Based on Deep Neural Networks
    Xu, Yong
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (01) : 7 - 19
  • [2] Target Speech Signal Enhancement Based on Deep Neural Networks
    Zhang, Xin
    Wang, MingJiang
    Xuan, XiaoGuang
    Sun, FengJiao
    [J]. 2019 2ND IEEE INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP), 2019, : 241 - 245
  • [3] SPEECH ENHANCEMENT BASED ON DEEP NEURAL NETWORKS WITH SKIP CONNECTIONS
    Tu, Ming
    Zhang, Xianxian
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5565 - 5569
  • [4] A Novel Approach to Speech Enhancement Based on Deep Neural Networks
    Salehi, Maryam
    Mirzakuchaki, Sattar
    [J]. ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, 2022, 22 (02) : 71 - 78
  • [5] An Experimental Study of Speech Emotion Recognition Based on Deep Convolutional Neural Networks
    Zheng, W. Q.
    Yu, J. S.
    Zou, Y. X.
    [J]. 2015 INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2015, : 827 - 831
  • [6] A NEW SPEECH ENHANCEMENT APPROACH BASED ON PROGRESSIVE DEEP NEURAL NETWORKS
    Shu, Xiaofeng
    Zhou, Yi
    Cao, Yin
    [J]. 2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 191 - 195
  • [7] Phase-Aware Speech Enhancement Based on Deep Neural Networks
    Zheng, Naijun
    Zhang, Xiao-Lei
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (01) : 63 - 76
  • [8] Speech Enhancement With Deep Neural Networks Using MoG Based Labels
    Hammer, Hodaya
    Rath, Gilad
    Chazan, Shlomo E.
    Goldberger, Jacob
    Gannot, Sharon
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON THE SCIENCE OF ELECTRICAL ENGINEERING IN ISRAEL (ICSEE), 2018,
  • [9] SPEECH ENHANCEMENT USING MULTIPLE DEEP NEURAL NETWORKS
    Karjol, Pavan
    Kumar, Ajay M.
    Ghosh, Prasanta Kumar
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5049 - 5053
  • [10] COMPRESSING DEEP NEURAL NETWORKS FOR EFFICIENT SPEECH ENHANCEMENT
    Tan, Ke
    Wang, DeLiang
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 8358 - 8362