Multi-resolution auditory cepstral coefficient and adaptive mask for speech enhancement with deep neural network

Cited by: 6
Authors
Li, Ruwei [1]
Sun, Xiaoyue [1]
Liu, Yanan [1]
Yang, Dengcai [1]
Dong, Liang [2]
Affiliations
[1] Beijing Univ Technol, Sch Informat & Commun Engn, Fac Informat Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing, Peoples R China
[2] Baylor Univ, Elect & Comp Engn, Waco, TX 76798 USA
Funding
National Natural Science Foundation of China;
Keywords
Speech enhancement; Deep neural network; Multi-resolution auditory cepstral coefficient; Adaptive mask; NOISE;
DOI
10.1186/s13634-019-0618-4
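Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract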
Existing speech enhancement algorithms perform poorly in non-stationary noise environments at low signal-to-noise ratio (SNR). To address this problem, this paper presents a novel speech enhancement algorithm based on multi-feature and an adaptive mask with deep learning. First, we construct a new feature called the multi-resolution auditory cepstral coefficient (MRACC). This feature, which is extracted from four cochleagrams of different resolutions, captures both local information and spectrotemporal context while reducing algorithm complexity. Second, we put forward an adaptive mask (AM) that can track noise changes. The AM flexibly combines the advantages of the ideal binary mask (IBM) and the ideal ratio mask (IRM) as the SNR changes. Third, a deep neural network (DNN) is used as a nonlinear function to estimate the adaptive mask, with the MRACC and its first and second derivatives as the DNN input. Finally, the estimated AM is used to weight the noisy speech to obtain the enhanced speech. Experimental results show that the proposed algorithm not only further improves speech quality and intelligibility but also suppresses more noise than the comparison algorithms. In addition, the proposed algorithm has lower complexity than the comparison algorithms.
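The abstract describes the adaptive mask only qualitatively, so the following is a minimal sketch of the general idea and not the paper's formula. It computes the standard IBM and IRM per time-frequency unit and blends them with an SNR-dependent weight. The sigmoid weighting, its direction, the `scale_db` parameter, and the function names are all assumptions introduced for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log of zero


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """IBM: 1 where the local SNR exceeds the criterion lc_db, else 0."""
    local_snr_db = 10.0 * np.log10((speech_energy + EPS) / (noise_energy + EPS))
    return (local_snr_db > lc_db).astype(float)


def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """IRM: speech-to-mixture energy ratio raised to the power beta."""
    return (speech_energy / (speech_energy + noise_energy + EPS)) ** beta


def blended_mask(speech_energy, noise_energy, scale_db=10.0):
    """Hypothetical SNR-weighted blend of IBM and IRM (an assumption).

    The weight alpha is a sigmoid of the local SNR in dB, so the blend
    leans toward the IBM at high SNR and toward the IRM at low SNR.
    """
    local_snr_db = 10.0 * np.log10((speech_energy + EPS) / (noise_energy + EPS))
    alpha = 1.0 / (1.0 + np.exp(-local_snr_db / scale_db))
    ibm = ideal_binary_mask(speech_energy, noise_energy)
    irm = ideal_ratio_mask(speech_energy, noise_energy)
    return alpha * ibm + (1.0 - alpha) * irm


# Two time-frequency units: one speech-dominated, one noise-dominated.
speech = np.array([100.0, 1.0])
noise = np.array([1.0, 100.0])
mask = blended_mask(speech, noise)

# Weighting the noisy energy with the mask attenuates the noisy unit.
enhanced = mask * (speech + noise)
```

Because the blend is a convex combination of two masks in [0, 1], the result also stays in [0, 1]. In the setting the abstract describes, a DNN would estimate the mask from noisy features, since the clean speech and noise energies are not available at test time.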
Pages: 16