Research on Speech Enhancement Algorithm of Multiresolution Cochleagram Based on Skip Connection Deep Neural Network

Cited by: 4
Authors
Lan, Chaofeng [1]
Wang, YuQiao [1]
Zhang, Lei [2]
Liu, Chundong [1]
Lin, Xiaojia [1]
Affiliations
[1] Harbin Univ Sci & Technol, Coll Measurement & Commun Engn, Harbin 150080, Peoples R China
[2] Beidahuang Ind Grp Gen Hosp, Harbin 150088, Peoples R China
Keywords
MASK;
DOI
10.1155/2022/5208372
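Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification code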
0808 ; 0809 ;
Abstract
Traditional deep learning algorithms enhance speech poorly at low signal-to-noise ratios (SNR). The skip-connection deep neural network (Skip-DNN) improves on the traditional deep neural network (DNN) by adding skip connections between the layers of the network, which addresses the degradation problem of DNNs. In this paper, the Multiresolution Cochleagram (MRCG) features in the gammachirp transform domain are denoised with the Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator (MMSE-STSA) to obtain the improved MRCG (I-MRCG). I-MRCG is then used as the input feature and Skip-DNN as the training network to improve the speech enhancement effect of the model. This paper also proposes an improved source-to-distortion ratio (SDR) loss function, which improves the performance of the Skip-DNN speech enhancement model. The experiments are performed on the Edinburgh dataset. With I-MRCG as the input feature of Skip-DNN, the average perceptual evaluation of speech quality (PESQ) is 2.9137 and the average short-time objective intelligibility (STOI) is 0.8515, improvements of 0.91% and 0.71%, respectively, over MRCG as the input feature. With the improved SDR as the loss function of the speech model, the average PESQ is 2.9699 and the average STOI is 0.8547. Compared with other loss functions, the improved SDR yields a better enhancement effect when used as the loss function of the speech enhancement model.
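For orientation, the two building blocks named in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: `sdr_db` computes the standard source-to-distortion ratio, not the improved SDR proposed in the paper (whose definition is not given here), and `skip_dense_layer` assumes an identity skip connection around a single ReLU layer, which may differ from the Skip-DNN layout the authors use. All function names are hypothetical.

```python
import numpy as np


def sdr_db(clean, estimate, eps=1e-12):
    """Standard SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    NOTE: this is the textbook definition, not the paper's improved SDR.
    """
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_power = np.sum(clean ** 2)
    distortion_power = np.sum((clean - estimate) ** 2)
    # eps guards against division by zero and log of zero.
    return 10.0 * np.log10((signal_power + eps) / (distortion_power + eps))


def neg_sdr_loss(clean, estimate):
    """Loss to minimise: a higher SDR means a lower loss."""
    return -sdr_db(clean, estimate)


def skip_dense_layer(x, weights, bias):
    """One fully connected ReLU layer with an identity skip connection.

    ASSUMPTION: input and output widths match so that x can be added back.
    """
    return np.maximum(0.0, x @ weights + bias) + x


# Example: an estimate that is 90% of the clean signal.
clean = np.array([1.0, 0.0, -1.0, 0.0])
estimate = 0.9 * clean
example_sdr = sdr_db(clean, estimate)

# Example: with zero weights the layer reduces to the identity mapping.
x = np.ones(3)
identity_out = skip_dense_layer(x, np.zeros((3, 3)), np.zeros(3))
```

The identity case shows why skip connections help against degradation: a layer that has learned nothing still passes its input through unchanged.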
Pages: 15
Related papers
50 in total
  • [21] CASA: A Convolution Accelerator using Skip Algorithm for Deep Neural Network
    Kim, Young Ho
    An, Gi Jo
    Sunwoo, Myung Hoon
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [22] Binaural Deep Neural Network for Robust Speech Enhancement
    Jiang, Yi
    Liu, Runsheng
    2014 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC), 2014, : 692 - 695
  • [23] The Application of Deep Neural Network in Speech Enhancement Processing
    Chen Jian-ming
    Liang Zhi-cheng
    2018 5TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2018), 2018, : 1263 - 1266
  • [24] Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems
    Kolbaek, Morten
    Tan, Zheng-Hua
    Jensen, Jesper
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (01) : 153 - 167
  • [25] Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments
    Gao, Tian
    Du, Jun
    Xu, Yong
    Liu, Cong
    Dai, Li-Rong
    Lee, Chin-Hui
    LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION, LVA/ICA 2015, 2015, 9237 : 75 - 82
  • [26] A Deep Neural Network Based Kalman Filter for Time Domain Speech Enhancement
    Yu, Hongjiang
    Ouyang, Zhiheng
    Zhu, Wei-Ping
    Champagne, Benoit
    Ji, Yunyun
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [27] Broad Phoneme Class Specific Deep Neural Network Based Speech Enhancement
    Karjol, Pavan
    Ghosh, Prasanta Kumar
    2018 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM 2018), 2018, : 372 - 376
  • [28] Deep neural network based speech enhancement using mono channel mask
    Pallavi P. Ingale
    Sanjay L. Nalbalwar
    International Journal of Speech Technology, 2019, 22 : 841 - 850
  • [29] Speech enhancement method based on the perceptual joint optimization deep neural network
    Yuan W.
    Lou Y.
    Liang C.
    Wang Z.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2019, 46 (02): : 90 - 94
  • [30] SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement
    Gao, Tian
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3713 - 3717