Research on Speech Enhancement Algorithm of Multiresolution Cochleagram Based on Skip Connection Deep Neural Network

Cited by: 4
Authors
Lan, Chaofeng [1]
Wang, YuQiao [1]
Zhang, Lei [2]
Liu, Chundong [1]
Lin, Xiaojia [1]
Affiliations
[1] Harbin Univ Sci & Technol, Coll Measurement & Commun Engn, Harbin 150080, Peoples R China
[2] Beidahuang Ind Grp Gen Hosp, Harbin 150088, Peoples R China
Keywords
MASK;
DOI
10.1155/2022/5208372
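Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808; 0809;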
Abstract
Traditional deep learning algorithms enhance speech poorly at low signal-to-noise ratios (SNR). The skip-connection deep neural network (Skip-DNN) improves on the traditional deep neural network (DNN) by adding skip connections between the layers of the network, which addresses the degradation problem of DNNs. In this paper, the Multiresolution Cochleagram (MRCG) features in the gammachirp transform domain are denoised to obtain the improved MRCG (I-MRCG). Denoising uses the Minimum Mean-Square Error Short-Time Spectral Amplitude (MMSE-STSA) estimator. I-MRCG then serves as the input feature and Skip-DNN as the training network, which improves the speech enhancement performance of the model. This paper also proposes an improved source-to-distortion ratio (SDR) loss function, which further improves the performance of the Skip-DNN speech enhancement model. The experiments are performed on the Edinburgh dataset. With I-MRCG as the input feature of Skip-DNN, the average perceptual evaluation of speech quality (PESQ) score is 2.9137 and the average short-time objective intelligibility (STOI) score is 0.8515, improvements of 0.91% and 0.71%, respectively, over MRCG as the input feature. With the improved SDR as the loss function, the average PESQ is 2.9699 and the average STOI is 0.8547. Compared with other loss functions, the improved SDR yields better enhancement when used as the loss function of the speech enhancement model.
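The abstract names two building blocks without defining them: skip connections between DNN layers and an SDR-based loss. The NumPy sketch below illustrates generic versions of both, under stated assumptions. It uses the standard SDR definition, 10·log10(‖s‖² / ‖s − ŝ‖²), not the paper's improved SDR, whose form is not given in this record. The identity-style skip connection, the ReLU activation, and the function names `sdr_db`, `negative_sdr_loss`, and `skip_dnn_forward` are illustrative choices and are not taken from the paper.

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-8):
    """Standard source-to-distortion ratio in dB (not the paper's improved SDR)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    distortion_power = np.sum((reference - estimate) ** 2)
    # eps guards against division by zero and log of zero.
    return 10.0 * np.log10((signal_power + eps) / (distortion_power + eps))


def negative_sdr_loss(reference, estimate):
    """Loss to minimize: a higher SDR gives a lower loss."""
    return -sdr_db(reference, estimate)


def skip_dnn_forward(x, weights, biases):
    """Forward pass of a fully connected network with identity skip connections.

    Assumption: each hidden layer adds its input to its ReLU output whenever
    the shapes match. The final layer is linear and has no skip connection.
    """
    h = np.asarray(x, dtype=np.float64)
    for w, b in zip(weights[:-1], biases[:-1]):
        z = np.maximum(h @ w + b, 0.0)            # ReLU hidden layer
        h = z + h if z.shape == h.shape else z    # skip connection
    return h @ weights[-1] + biases[-1]
```

Because the skip path passes the input through unchanged, a hidden layer whose weights are all zero still forwards its input to the next layer. This is one common explanation of why skip connections ease the degradation problem in deeper networks.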
Pages: 15
Related papers
50 in total
  • [31] Improved Sparse NMF based Speech Enhancement Method with Deep Neural Network
    Zou, Xia
    Zhang, Xiongwei
    Shi, Wenhua
    Wang, Fupeng
    Zhang, Jingtao
    Gao, Mingyue
    PROCEEDINGS OF THE 2ND INTERNATIONAL FORUM ON MANAGEMENT, EDUCATION AND INFORMATION TECHNOLOGY APPLICATION (IFMEITA 2017), 2017, 130 : 231 - 234
  • [32] A Novel Adversarial Training Scheme for Deep Neural Network based Speech Enhancement
    Cornell, Samuele
    Principi, Emanuele
    Squartini, Stefano
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [33] GLOBAL VARIANCE EQUALIZATION FOR IMPROVING DEEP NEURAL NETWORK BASED SPEECH ENHANCEMENT
    Xu, Yong
    Du, Jun
    Dai, Li-Rong
    Lee, Chin-Hui
    2014 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (CHINASIP), 2014, : 71 - 75
  • [34] Deep neural network based speech enhancement using mono channel mask
    Ingale, Pallavi P.
    Nalbalwar, Sanjay L.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2019, 22 (03) : 841 - 850
  • [35] Effect of spectrogram resolution on deep-neural-network-based speech enhancement
    Takeuchi, Daiki
    Yatabe, Kohei
    Koizumi, Yuma
    Oikawa, Yasuhiro
    Harada, Noboru
    ACOUSTICAL SCIENCE AND TECHNOLOGY, 2020, 41 (05) : 769 - 775
  • [36] Research on Image Hiding Algorithm Based on Deep Neural Network
    Beijing University of Posts and Telecommunications, School of Communication Engineering, Beijing
    100876, China
    Proc SPIE Int Soc Opt Eng, 1600,
  • [37] A retrieval algorithm for encrypted speech based on convolutional neural network and deep hashing
    Qiu-yu Zhang
    Yu-zhou Li
    Ying-jie Hu
    Multimedia Tools and Applications, 2021, 80 : 1201 - 1221
  • [38] A retrieval algorithm for encrypted speech based on convolutional neural network and deep hashing
    Zhang, Qiu-yu
    Li, Yu-zhou
    Hu, Ying-jie
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (01) : 1201 - 1221
  • [39] Deep Convolutional Neural Network-based Speech Signal Enhancement Using Extensive Speech Features
    Garg, Anil
    Sahu, O. P.
    INTERNATIONAL JOURNAL OF COMPUTATIONAL METHODS, 2022, 19 (08)
  • [40] Convolutional Deep Neural Network and Full Connectivity for Speech Enhancement
    Alameri, Ban M.
    Kadhim, Inas Jawad
    Hadi, Suha Qasim
    Hassoon, Ali F.
    Abd, Mustafa M.
    Premaratne, Prashan
    INTERNATIONAL JOURNAL OF ONLINE AND BIOMEDICAL ENGINEERING, 2023, 19 (04) : 140 - 154