A NEW COST FUNCTION FOR DNN-BASED SPEECH ENHANCEMENT COMBINING NMF AND CASA

Authors
Yan, Bofang [1]
Bao, Changchun [1]
Bai, Zhigang [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Speech & Audio Signal Proc Lab, Beijing 100124, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep neural network; cost function; nonnegative matrix factorization; computational auditory scene analysis; speech enhancement; monaural speech
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808 ; 0809 ;
Abstract
In this paper, a novel deep neural network (DNN) training approach is proposed for speech enhancement based on nonnegative matrix factorization (NMF) and computational auditory scene analysis (CASA). Considering the higher correlation that the NMF algorithm captures along the frequency bins for time-varying signals and the strong noise-masking effect of CASA, we propose a new cost function for DNN training that combines the ideal ratio mask (IRM) with an NMF-based Wiener-like filter. Extensive experiments are carried out to verify the performance of the proposed method. We also compare the developed algorithm with the traditional NMF approach, the NMF-based linear minimum mean square error (LMMSE) filter approach, and the CASA method. The results demonstrate that the proposed approach greatly improves speech quality.
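
The abstract names the two ingredients of the cost function (the IRM and an NMF-based Wiener-like filter) but does not give the formula. The sketch below is only one plausible reading, not the authors' method: it uses the common square-root definition of the IRM, a Wiener-like gain built from NMF reconstructions of speech and noise, and a weighted sum of two mean-squared errors. The function names, the weight `alpha`, and the weighted-sum form are all assumptions made for illustration.

```python
import numpy as np


def ideal_ratio_mask(speech_pow, noise_pow, eps=1e-12):
    """Common IRM definition: sqrt(S / (S + N)) per time-frequency bin."""
    return np.sqrt(speech_pow / (speech_pow + noise_pow + eps))


def nmf_wiener_mask(W_s, H_s, W_n, H_n, eps=1e-12):
    """Wiener-like gain from NMF reconstructions (basis @ activation).

    W_* has shape (freq_bins, components); H_* has shape (components, frames).
    """
    speech_hat = W_s @ H_s  # NMF estimate of the speech spectrogram
    noise_hat = W_n @ H_n   # NMF estimate of the noise spectrogram
    return speech_hat / (speech_hat + noise_hat + eps)


def combined_cost(pred_mask, irm, wiener_mask, alpha=0.5):
    """Hypothetical cost: weighted sum of MSEs against the two targets.

    alpha is an illustrative mixing weight, not a value from the paper.
    """
    mse_irm = np.mean((pred_mask - irm) ** 2)
    mse_wiener = np.mean((pred_mask - wiener_mask) ** 2)
    return alpha * mse_irm + (1.0 - alpha) * mse_wiener
```

In this reading, `pred_mask` would be the DNN output for one utterance, and `alpha` would trade off the CASA-style target against the NMF-style target.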
Pages: 255-259
Page count: 5