Coupled Dictionaries for Exemplar-Based Speech Enhancement and Automatic Speech Recognition

Cited by: 27
Authors
Baby, Deepak [1]
Virtanen, Tuomas [2]
Gemmeke, Jort F. [1]
van Hamme, Hugo [1]
Affiliations
[1] Katholieke Univ Leuven, Speech Proc Res Grp, Elect Engn Dept ESAT, B-3000 Leuven, Belgium
[2] Tampere Univ Technol, Dept Signal Proc, FI-33101 Tampere, Finland
Keywords
Exemplar-based; modulation envelope; noise robust automatic speech recognition; non-negative sparse coding; MODULATION; NOISE;
DOI
10.1109/TASLP.2015.2450491
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206; 082403;
Abstract
Exemplar-based speech enhancement systems decompose the noisy speech as a weighted sum of speech and noise exemplars stored in a dictionary and use the resulting speech and noise estimates to obtain a time-varying filter in the full-resolution frequency domain to enhance the noisy speech. To obtain the decomposition, exemplars sampled in lower-dimensional spaces are preferred over the full-resolution frequency domain for their reduced computational complexity and their ability to generalize better to unseen cases. However, the resulting filter may be sub-optimal, as the mapping of the obtained speech and noise estimates to the full-resolution frequency domain yields a low-rank approximation. This paper proposes an efficient way to directly compute the full-resolution frequency estimates of speech and noise using coupled dictionaries: an input dictionary containing atoms from the desired exemplar space to obtain the decomposition, and a coupled output dictionary containing exemplars from the full-resolution frequency domain. We also introduce modulation spectrogram features for the exemplar-based tasks using this approach. The proposed system was evaluated for various choices of input exemplars and yielded improved speech enhancement performance on the AURORA-2 and AURORA-4 databases. We further show that the proposed approach also results in improved word error rates (WERs) for speech recognition tasks using HMM-GMM and deep neural network (DNN) based systems.
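The coupled-dictionary idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the KL-divergence multiplicative update, the sparsity penalty, the Wiener-like gain, the single-frame exemplars, the random pooling matrix M standing in for a low-dimensional feature mapping, and all function and variable names (nmf_activations, coupled_enhance, A_in, A_out) are assumptions made for illustration. Activations are estimated on the low-dimensional input dictionary and then applied to the coupled full-resolution output dictionary to build the filter.

```python
import numpy as np

def nmf_activations(y, A, n_iter=200, sparsity=0.1, eps=1e-12):
    """Non-negative sparse coding of y on a fixed dictionary A.

    Assumed solver: multiplicative updates for the KL divergence
    with an L1 penalty on the activations.
    """
    x = np.ones(A.shape[1])
    col_sums = A.sum(axis=0)  # equals A.T @ ones
    for _ in range(n_iter):
        y_hat = A @ x + eps
        x *= (A.T @ (y / y_hat)) / (col_sums + sparsity)
    return x

def coupled_enhance(y_in, y_full, A_in, A_out, n_speech, **kw):
    """Decompose in the input space, reconstruct in the full-resolution space.

    The first n_speech atoms of both dictionaries are speech exemplars,
    the remaining atoms are noise exemplars (an assumed layout).
    """
    x = nmf_activations(y_in, A_in, **kw)
    s_hat = A_out[:, :n_speech] @ x[:n_speech]   # full-resolution speech estimate
    n_hat = A_out[:, n_speech:] @ x[n_speech:]   # full-resolution noise estimate
    gain = s_hat / (s_hat + n_hat + 1e-12)       # Wiener-like filter (assumption)
    return gain * y_full, gain

# Toy data: 64 full-resolution bins, 16 input features, 6 speech + 4 noise atoms.
rng = np.random.default_rng(0)
n_speech = 6
A_out = rng.random((64, 10))          # hypothetical output dictionary
M = rng.random((16, 64))              # hypothetical pooling to the input space
A_in = M @ A_out                      # coupled input dictionary
x_true = np.zeros(10)
x_true[[1, 7]] = [2.0, 1.0]           # one speech atom and one noise atom active
y_full = A_out @ x_true               # noisy full-resolution frame
y_in = M @ y_full                     # same frame in the input space
enhanced, gain = coupled_enhance(y_in, y_full, A_in, A_out, n_speech)
```

Because the activations are applied to A_out directly, the speech and noise estimates live in the full-resolution domain without an explicit mapping back from the input feature space.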
Pages: 1788-1799
Number of pages: 12
Related papers
  • [1] EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION
    Baby, Deepak
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Van Hamme, Hugo
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4485 - 4489
  • [2] COUPLED DICTIONARY TRAINING FOR EXEMPLAR-BASED SPEECH ENHANCEMENT
    Baby, Deepak
    Virtanen, Tuomas
    Barker, Tom
    Van Hamme, Hugo
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [3] HYBRID INPUT SPACES FOR EXEMPLAR-BASED NOISE ROBUST SPEECH RECOGNITION USING COUPLED DICTIONARIES
    Baby, Deepak
    Van Hamme, Hugo
    [J]. 2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 1676 - 1680
  • [4] Exemplar-Based Processing for Speech Recognition
    Sainath, Tara N.
    Ramabhadran, Bhuvana
    Nahamoo, David
    Kanevsky, Dimitri
    Van Compernolle, Dirk
    Demuynck, Kris
    Gemmeke, Jort Florent
    Bellegarda, Jerome R.
    Sundaram, Shiva
    [J]. IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 98 - 113
  • [5] Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
    Gemmeke, Jort F.
    Virtanen, Tuomas
    Hurmalainen, Antti
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (07) : 2067 - 2080
  • [6] Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks
    Sainath, Tara N.
    Nahamoo, David
    Kanevsky, Dimitri
    Ramabhadran, Bhuvana
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2127 - 2130
  • [7] SPEECH SEGMENT CLUSTERING FOR REAL-TIME EXEMPLAR-BASED SPEECH ENHANCEMENT
    Nesbitt, David
    Crookes, Danny
    Ming, Ji
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5419 - 5423
  • [8] Estimating Uncertainty to Improve Exemplar-Based Feature Enhancement for Noise Robust Speech Recognition
    Kallasjoki, Heikki
    Gemmeke, Jort F.
    Palomaki, Kalle J.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (02) : 368 - 380
  • [9] EXEMPLAR-BASED NOISE ROBUST AUTOMATIC SPEECH RECOGNITION USING MODULATION SPECTROGRAM FEATURES
    Baby, Deepak
    Virtanen, Tuomas
    Gemmeke, Jort F.
    Barker, Tom
    Van Hamme, Hugo
    [J]. 2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 519 - 524