EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION

Cited by: 0
Authors
Baby, Deepak [1]
Gemmeke, Jort F. [1]
Virtanen, Tuomas [2]
Van hamme, Hugo [1]
Affiliations
[1] Katholieke Univ Leuven, Dept ESAT, Leuven, Belgium
[2] Tampere Univ Technol, Dept Signal Proc, Tampere, Finland
Keywords
deep neural networks; non-negative matrix factorisation; coupled dictionaries; speech enhancement; modulation envelope
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
Deep neural network (DNN) based acoustic modelling has been successfully used for a variety of automatic speech recognition (ASR) tasks, thanks to its ability to learn higher-level information using multiple hidden layers. This paper investigates the recently proposed exemplar-based speech enhancement technique using coupled dictionaries as a pre-processing stage for DNN-based systems. In this setting, the noisy speech is decomposed as a weighted sum of atoms in an input dictionary containing exemplars sampled from a domain of choice, and the resulting weights are applied to a coupled output dictionary containing exemplars sampled in the short-time Fourier transform (STFT) domain to directly obtain the speech and noise estimates for speech enhancement. In this work, settings using input dictionaries of exemplars sampled from the STFT, Mel-integrated magnitude STFT and modulation envelope spectra are evaluated. Experiments performed on the AURORA-4 database revealed that these pre-processing stages can improve the performance of DNN-HMM-based ASR systems with both clean and multi-condition training.
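The coupled-dictionary idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions: atoms span a single frame (the paper uses exemplars, which typically cover several frames), the weights are found with standard Kullback-Leibler multiplicative updates for a fixed dictionary, and the speech and noise estimates are combined into a Wiener-like gain. The function names `estimate_activations` and `enhance` and all parameter names are hypothetical.

```python
import numpy as np


def estimate_activations(Y, A, n_iter=200, eps=1e-12):
    """Find non-negative weights X with Y ~ A @ X for a fixed dictionary A.

    Uses the standard multiplicative update that minimises the
    Kullback-Leibler divergence (an assumed choice of cost function).
    """
    rng = np.random.default_rng(0)
    X = rng.random((A.shape[1], Y.shape[1])) + eps
    col_sums = A.sum(axis=0)[:, None] + eps  # normaliser, one value per atom
    for _ in range(n_iter):
        X *= (A.T @ (Y / (A @ X + eps))) / col_sums
    return X


def enhance(Y_in, Y_stft_mag,
            A_speech_in, A_noise_in,
            A_speech_out, A_noise_out, eps=1e-12):
    """Enhance a noisy magnitude STFT using coupled dictionaries.

    Y_in        : noisy features in the input domain (features x frames)
    Y_stft_mag  : noisy magnitude STFT (frequency bins x frames)
    A_*_in      : input-domain atoms for speech and noise
    A_*_out     : coupled STFT-domain atoms (same atom order as A_*_in)
    """
    # Decompose the noisy input as a weighted sum of speech and noise atoms.
    A_in = np.hstack([A_speech_in, A_noise_in])
    X = estimate_activations(Y_in, A_in)

    # Apply the same weights to the coupled STFT-domain dictionaries.
    n_speech = A_speech_in.shape[1]
    S_hat = A_speech_out @ X[:n_speech]
    N_hat = A_noise_out @ X[n_speech:]

    # Wiener-like gain (an assumption of this sketch) applied to the noisy STFT.
    gain = S_hat / (S_hat + N_hat + eps)
    return gain * Y_stft_mag
```

Here `Y_in` could hold, for example, Mel-integrated magnitudes while `Y_stft_mag` holds the full-resolution magnitude STFT of the same frames; the only requirement is that column k of each input dictionary and column k of its output dictionary come from the same exemplar.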
Pages: 4485-4489
Number of pages: 5
Related papers
50 in total
  • [21] Speech Emotion Recognition Based on Deep Neural Network
    Zhu, Zijiang
    Hu, Yi
    Li, Junshan
    Li, Jianjun
    Wang, Junhua
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2020, 126 : 154 - 154
  • [22] Deep Neural Network-based Speech Separation Combining with MVDR Beamformer for Automatic Speech Recognition System
    Lee, Bong-Ki
    Jeong, Jaewoong
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2019,
  • [23] Noise Robust Exemplar Matching for Speech Enhancement: Applications to Automatic Speech Recognition
    Yilmaz, Emre
    Baby, Deepak
    Van hamme, Hugo
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 688 - 692
  • [24] Integrated exemplar-based template matching and statistical modeling for continuous speech recognition
    Xie Sun
    Yunxin Zhao
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2014
  • [25] Integrated exemplar-based template matching and statistical modeling for continuous speech recognition
    Sun, Xie
    Zhao, Yunxin
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014,
  • [26] An optimization method for speech enhancement based on deep neural network
    Sun, Haixia
    Li, Sikun
    [J]. 3RD INTERNATIONAL CONFERENCE ON ADVANCES IN ENERGY, ENVIRONMENT AND CHEMICAL ENGINEERING, 2017, 69
  • [27] Speech enhancement based on noise classification and deep neural network
    Wang, Wenbo
    Liu, Houguang
    Yang, Jianhua
    Cao, Guohua
    Hua, Chunli
    [J]. MODERN PHYSICS LETTERS B, 2019, 33 (17):
  • [28] Audiovisual speech recognition based on a deep convolutional neural network
    Rudregowda, Shashidhar
    Patilkulkarni, Sudarshan
    Ravi, Vinayakumar
    H.L., Gururaj
    Krichen, Moez
    [J]. Data Science and Management, 2024, 7 (01): : 25 - 34
  • [29] Deep Convolution Neural Network Based Speech Recognition for Chhattisgarhi
    Londhe, Narendra D.
    Kshirsagar, Ghanahshyam B.
    Tekchandani, Hitesh
    [J]. 2018 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2018, : 667 - 671
  • [30] TOWARDS STRUCTURED DEEP NEURAL NETWORK FOR AUTOMATIC SPEECH RECOGNITION
    Liao, Yi-Hsiu
    Lee, Hung-yi
    Lee, Lin-shan
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 137 - 144