A Reduced Complexity MFCC-based Deep Neural Network Approach for Speech Enhancement

Cited by: 0
Authors
Razani, Ryan [1]
Chung, Hanwook [1]
Attabi, Yazid [1]
Champagne, Benoit [1]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, 3480 Univ St, Montreal, PQ, Canada
Keywords
Speech enhancement; deep learning; neural networks; low-complexity; MFCC; NOISE;
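DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract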
This paper focuses on a regression-based deep neural network (DNN) approach for single-channel speech enhancement. While DNNs can improve speech quality compared to classical approaches, they incur high computational complexity in the training stage. The main contribution of this work is to reduce the DNN complexity by introducing a spectral feature mapping from noisy mel-frequency cepstral coefficients (MFCC) to the enhanced short-time Fourier transform (STFT) spectrum. This approach requires far fewer input features and consequently leads to reduced DNN complexity. Exploiting the frequency-domain speech features obtained from this mapping also avoids the information loss incurred when reconstructing the time-domain speech signal from its MFCC. Compared to the STFT-based DNN approach, the training-phase complexity of our approach is reduced by a factor of 4.75. Moreover, experimental results in terms of perceptual evaluation of speech quality (PESQ) and source-to-distortion ratio (SDR) show that the proposed approach outperforms the benchmark algorithms across various noise types and SNR levels.
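As a rough illustration of why a smaller input feature vector reduces network size, the sketch below counts the weights and biases of a fully connected network for an STFT-sized input and for an MFCC-sized input, both mapping to an STFT-sized output. Every number in it (FFT size, number of MFCCs, context window, hidden layer widths) and the helper `mlp_param_count` are assumptions chosen for illustration; they are not taken from the paper and are not meant to reproduce its reported factor of 4.75.

```python
def mlp_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of every layer, input first, output last.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# All sizes below are illustrative assumptions, not values from the paper.
N_STFT_BINS = 257            # one-sided spectrum of a 512-point FFT
N_MFCC = 40                  # hypothetical number of cepstral coefficients
CONTEXT = 5                  # hypothetical number of stacked input frames
HIDDEN = [1024, 1024, 1024]  # hypothetical hidden layer widths

# Both networks predict one enhanced STFT frame; only the input differs.
stft_net = [N_STFT_BINS * CONTEXT] + HIDDEN + [N_STFT_BINS]
mfcc_net = [N_MFCC * CONTEXT] + HIDDEN + [N_STFT_BINS]

stft_params = mlp_param_count(stft_net)
mfcc_params = mlp_param_count(mfcc_net)

print("STFT-input parameters:", stft_params)
print("MFCC-input parameters:", mfcc_params)
print("ratio:", round(stft_params / mfcc_params, 2))
```

With fixed hidden widths, only the first layer shrinks, so the saving here is modest. A larger reduction would presumably also require narrower hidden layers, which this sketch does not model.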
Pages: 331 - 336
Page count: 6
Related papers
50 in total
  • [1] MFCC-based deep convolutional neural network for audio depression recognition
    Wang, Yafan
    Lu, Xiaoyong
    Shi, Daimin
    2022 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2022), 2022, : 162 - 166
  • [2] MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech
    Rejaibi, Emna
    Komaty, Ali
    Meriaudeau, Fabrice
    Agrebi, Said
    Othmani, Alice
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2022, 71
  • [3] A MFCC-based CELP speech coder for server-based speech recognition in network environments
    Yoon, Jae Sam
    Lee, Gil Ho
    Kim, Hong Kook
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2007, E90A (03) : 626 - 632
  • [4] A Perceptually Motivated Approach for Speech Enhancement Based on Deep Neural Network
    Han, Wei
    Zhang, Xiongwei
    Min, Gang
    Sun, Meng
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2016, E99A (04) : 835 - 838
  • [5] ASERNet: Automatic speech emotion recognition system using MFCC-based LPC approach with deep learning CNN
    Jagadeeshwar, Kalyanapu
    Sreenivasarao, T.
    Pulicherla, Padmaja
    Satyanarayana, K. N. V.
    Lakshmi, K. Mohana
    Kumar, Pala Mahesh
    INTERNATIONAL JOURNAL OF MODELING SIMULATION AND SCIENTIFIC COMPUTING, 2023, 14 (04)
  • [6] Speech emotion recognition using MFCC-based entropy feature
    Siba Prasad Mishra
    Pankaj Warule
    Suman Deb
    Signal, Image and Video Processing, 2024, 18 : 153 - 161
  • [7] Speech Enhancement based on Deep Convolutional Neural Network
    Nuthakki, Ramesh
    Masanta, Payel
    Yukta, T. N.
    PROCEEDINGS OF THE 2021 FIFTH INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC 2021), 2021, : 770 - 775
  • [8] Supervised speech enhancement based on deep neural network
    Saleem, Nasir
    Khattak, Muhammad Irfan
    Qazi, Abdul Baser
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (04) : 5187 - 5201
  • [9] Speech emotion recognition using MFCC-based entropy feature
    Mishra, Siba Prasad
    Warule, Pankaj
    Deb, Suman
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (01) : 153 - 161
  • [10] SPEECH RECONSTRUCTION FOR MFCC-BASED LOW BIT-RATE SPEECH CODING
    Jiang Wenbin
    Ying Rendong
    Liu Peilin
    2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2014,