EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION

Cited by: 0
Authors
Baby, Deepak [1]
Gemmeke, Jort F. [1]
Virtanen, Tuomas [2]
Van hamme, Hugo [1]
Affiliations
[1] Katholieke Univ Leuven, Dept ESAT, Leuven, Belgium
[2] Tampere Univ Technol, Dept Signal Proc, Tampere, Finland
Keywords
deep neural networks; non-negative matrix factorisation; coupled dictionaries; speech enhancement; modulation envelope
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Deep neural network (DNN) based acoustic modelling has been successfully used for a variety of automatic speech recognition (ASR) tasks, thanks to its ability to learn higher-level information using multiple hidden layers. This paper investigates the recently proposed exemplar-based speech enhancement technique using coupled dictionaries as a pre-processing stage for DNN-based systems. In this setting, the noisy speech is decomposed as a weighted sum of atoms in an input dictionary containing exemplars sampled from a domain of choice, and the resulting weights are applied to a coupled output dictionary containing exemplars sampled in the short-time Fourier transform (STFT) domain to directly obtain the speech and noise estimates for speech enhancement. In this work, settings using input dictionaries of exemplars sampled from the STFT, Mel-integrated magnitude STFT and modulation envelope spectra are evaluated. Experiments performed on the AURORA-4 database revealed that these pre-processing stages can improve the performance of DNN-HMM-based ASR systems with both clean and multi-condition training.
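The decomposition described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the multiplicative Kullback-Leibler update, the Wiener-like mask, the single-frame exemplars, the random stand-in for a Mel filterbank, and all names (`estimate_activations`, `enhance`, `A_in`, `A_out`) are assumptions chosen for illustration. The only parts taken from the abstract are the two steps themselves: find non-negative weights over a joint speech-plus-noise input dictionary, then apply those weights to a coupled STFT-domain output dictionary.

```python
import numpy as np

def estimate_activations(Y, A, n_iter=200, sparsity=0.0, eps=1e-12, seed=0):
    """Find non-negative weights X with Y ~= A @ X, keeping the dictionary A fixed.

    Uses multiplicative updates for the KL divergence (an assumed choice).
    """
    rng = np.random.default_rng(seed)
    X = rng.random((A.shape[1], Y.shape[1])) + eps
    denom = A.sum(axis=0)[:, None] + sparsity + eps  # A^T 1 + sparsity penalty
    for _ in range(n_iter):
        ratio = Y / (A @ X + eps)
        X *= (A.T @ ratio) / denom
    return X

def enhance(Y_in, Y_stft, A_in, A_out, n_speech, **kwargs):
    """Decompose in the input domain, reconstruct in the STFT domain.

    Columns [0, n_speech) of both dictionaries are speech atoms, the rest noise.
    The Wiener-like mask is an assumed way of using the two estimates.
    """
    X = estimate_activations(Y_in, A_in, **kwargs)
    S_hat = A_out[:, :n_speech] @ X[:n_speech]   # speech estimate (STFT magnitude)
    N_hat = A_out[:, n_speech:] @ X[n_speech:]   # noise estimate (STFT magnitude)
    mask = S_hat / (S_hat + N_hat + 1e-12)
    return mask * Y_stft, S_hat, N_hat

# Toy data: random non-negative matrices stand in for real exemplars.
rng = np.random.default_rng(1)
n_freq, n_mel, n_speech, n_noise, n_frames = 64, 12, 8, 4, 20
A_out = rng.random((n_freq, n_speech + n_noise)) + 0.01  # STFT-domain atoms
M = rng.random((n_mel, n_freq))    # hypothetical stand-in for a Mel filterbank
A_in = M @ A_out                   # coupled atoms in the Mel-integrated domain
Y_stft = A_out @ rng.random((n_speech + n_noise, n_frames))  # "noisy" magnitudes
Y_in = M @ Y_stft

enhanced, S_hat, N_hat = enhance(Y_in, Y_stft, A_in, A_out, n_speech)
print(enhanced.shape)
```

Because each input atom here is the Mel-integrated version of its output atom, weights estimated in the lower-dimensional input domain carry over directly to the full-resolution STFT domain, which is the point of coupling the two dictionaries.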
Pages: 4485-4489
Number of pages: 5