SPEECH ENHANCEMENT USING MULTIPLE DEEP NEURAL NETWORKS

Cited by: 0
Authors
Karjol, Pavan [1]
Kumar, Ajay M. [2]
Ghosh, Prasanta Kumar [1]
Affiliations
[1] Indian Inst Sci, Elect Engn, Bengaluru 560012, India
[2] NIT K, Elect & Commun Engn, Surathkal 575025, India
Keywords
Deep neural networks; speech enhancement; gating network; estimators
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this work, we present a variant of the multiple deep neural network (DNN) based speech enhancement method. We directly estimate the clean speech spectrum as a weighted average of the outputs of multiple DNNs. The weights are provided by a gating network, and the multiple DNNs and the gating network are trained jointly. The objective function is the mean square logarithmic error between the target clean spectrum and the estimated spectrum. We conduct experiments with two and four DNNs on the TIMIT corpus, using nine noise types (four seen and five unseen) taken from the AURORA database at four different signal-to-noise ratios (SNRs). We also compare the proposed method with a single-DNN speech enhancement scheme and with existing multiple-DNN schemes, using segmental SNR, perceptual evaluation of speech quality (PESQ), and short-term objective intelligibility (STOI) as the evaluation metrics. These comparisons show that the proposed method outperforms the baseline schemes for both seen and unseen noises. Specifically, averaged over all noises and SNRs, we observe absolute PESQ improvements of 0.07 and 0.04 over the single DNN for the seen and unseen noise cases, respectively.
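The gated combination described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the gating network's raw scores are normalized with a softmax, that the loss is the mean squared difference of log spectra, and it works on a single frame with made-up numbers. The names `softmax`, `mixture_estimate`, and `mean_square_log_error` are hypothetical helpers, and the DNNs themselves are replaced by fixed example outputs.

```python
import math


def softmax(logits):
    """Turn raw gating scores into non-negative weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_estimate(expert_frames, gate_logits):
    """Weighted average of K expert spectra for one frame.

    expert_frames: list of K lists, each holding F spectral magnitudes
                   (stand-ins for the outputs of the K DNNs).
    gate_logits:   list of K raw scores (stand-in for the gating network).
    """
    weights = softmax(gate_logits)  # softmax gating is an assumption
    n_bins = len(expert_frames[0])
    return [
        sum(w * frame[f] for w, frame in zip(weights, expert_frames))
        for f in range(n_bins)
    ]


def mean_square_log_error(clean, estimate, eps=1e-8):
    """Mean squared difference between log spectra (assumed loss form)."""
    squared = [
        (math.log(c + eps) - math.log(e + eps)) ** 2
        for c, e in zip(clean, estimate)
    ]
    return sum(squared) / len(squared)


# Toy example: two "DNN" outputs for one frame with three frequency bins.
expert_frames = [[0.2, 0.4, 0.8], [0.6, 0.4, 0.2]]
gate_logits = [0.0, 0.0]  # equal scores give equal weights
estimate = mixture_estimate(expert_frames, gate_logits)
clean = [0.4, 0.4, 0.5]
loss = mean_square_log_error(clean, estimate)
```

With equal gating scores the estimate reduces to the plain average of the two outputs. In joint training, the gradient of the loss would flow through both the weights and the individual outputs, which is what allows the gating network and the DNNs to be optimized together.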
Pages: 5049-5053
Number of pages: 5