EXEMPLAR-BASED SPEECH ENHANCEMENT FOR DEEP NEURAL NETWORK BASED AUTOMATIC SPEECH RECOGNITION

Cited by: 0

Authors:

Baby, Deepak [1]

Gemmeke, Jort F. [1]

Virtanen, Tuomas [2]

Van hamme, Hugo [1]

Affiliations:

[1] Katholieke Univ Leuven, Dept ESAT, Leuven, Belgium

[2] Tampere Univ Technol, Dept Signal Proc, Tampere, Finland

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015

Keywords:

deep neural networks; non-negative matrix factorisation; coupled dictionaries; speech enhancement; modulation envelope;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Deep neural network (DNN) based acoustic modelling has been successfully used for a variety of automatic speech recognition (ASR) tasks, thanks to its ability to learn higher-level information using multiple hidden layers. This paper investigates the recently proposed exemplar-based speech enhancement technique using coupled dictionaries as a pre-processing stage for DNN-based systems. In this setting, the noisy speech is decomposed as a weighted sum of atoms in an input dictionary containing exemplars sampled from a domain of choice, and the resulting weights are applied to a coupled output dictionary containing exemplars sampled in the short-time Fourier transform (STFT) domain to directly obtain the speech and noise estimates for speech enhancement. In this work, settings using input dictionaries of exemplars sampled from the STFT, Mel-integrated magnitude STFT and modulation envelope spectra are evaluated. Experiments performed on the AURORA-4 database revealed that these pre-processing stages can improve the performance of DNN-HMM-based ASR systems with both clean and multi-condition training.
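
The coupled-dictionary idea described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. The abstract does not state the cost function, the solver, or the final filtering step, so the KL-divergence multiplicative update and the Wiener-like gain used here are assumptions, as are all function and variable names. The dictionaries are filled with random toy values and each exemplar is treated as a single frame, whereas real exemplars would be sampled from speech and noise recordings.

```python
import numpy as np


def nmf_activations(y, A, n_iter=200, sparsity=0.0, eps=1e-12):
    """Find non-negative weights x so that A @ x approximates y.

    Assumption: multiplicative updates for the KL divergence with an
    optional L1 sparsity penalty. The abstract does not name the solver.
    """
    x = np.ones(A.shape[1])
    denom = A.sum(axis=0) + sparsity + eps  # fixed denominator of the update
    for _ in range(n_iter):
        ratio = y / (A @ x + eps)           # element-wise observation / model
        x *= (A.T @ ratio) / denom          # keeps x non-negative
    return x


def enhance_frame(y_in, y_stft, S_in, N_in, S_out, N_out, eps=1e-12):
    """Enhance one noisy STFT magnitude frame using coupled dictionaries.

    S_in, N_in  : speech and noise atoms in the input domain (e.g. Mel).
    S_out, N_out: the coupled atoms in the STFT magnitude domain.
    """
    # 1) Decompose the noisy input-domain frame over the input dictionary.
    x = nmf_activations(y_in, np.hstack([S_in, N_in]))
    x_speech, x_noise = x[:S_in.shape[1]], x[S_in.shape[1]:]

    # 2) Apply the same weights to the coupled STFT-domain dictionary.
    speech_est = S_out @ x_speech
    noise_est = N_out @ x_noise

    # 3) Assumed Wiener-like gain applied to the noisy STFT magnitude.
    gain = speech_est / (speech_est + noise_est + eps)
    return gain * y_stft, x


# Toy data: random dictionaries stand in for sampled exemplars.
rng = np.random.default_rng(0)
n_mel, n_bins, n_speech, n_noise = 23, 257, 8, 4
S_in, N_in = rng.random((n_mel, n_speech)), rng.random((n_mel, n_noise))
S_out, N_out = rng.random((n_bins, n_speech)), rng.random((n_bins, n_noise))
x_true = rng.random(n_speech + n_noise)
y_in = np.hstack([S_in, N_in]) @ x_true      # noisy frame, input domain
y_stft = np.hstack([S_out, N_out]) @ x_true  # same frame, STFT domain

enhanced, x = enhance_frame(y_in, y_stft, S_in, N_in, S_out, N_out)
```

Because the weights are estimated in the input domain and reused on the STFT-domain atoms, the input representation can be swapped (STFT, Mel-integrated magnitude STFT, or modulation envelope spectra) without changing the reconstruction step, which is the comparison the abstract describes.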


Pages: 4485-4489

Number of pages: 5

50 records in total

[1] Coupled Dictionaries for Exemplar-Based Speech Enhancement and Automatic Speech Recognition
Baby, Deepak
Virtanen, Tuomas
Gemmeke, Jort F.
van Hamme, Hugo
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (11) : 1788 - 1799
[2] Exemplar-Based Processing for Speech Recognition
Sainath, Tara N.
Ramabhadran, Bhuvana
Nahamoo, David
Kanevsky, Dimitri
Van Compernolle, Dirk
Demuynck, Kris
Gemmeke, Jort Florent
Bellegarda, Jerome R.
Sundaram, Shiva
[J]. IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 98 - 113
[3] Sparse modeling of neural network posterior probabilities for exemplar-based speech recognition
Dighe, Pranay
Asaei, Afsaneh
Bourlard, Herve
[J]. SPEECH COMMUNICATION, 2016, 76 : 230 - 244
[4] Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
Gemmeke, Jort F.
Virtanen, Tuomas
Hurmalainen, Antti
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (07) : 2067 - 2080
[5] COUPLED DICTIONARY TRAINING FOR EXEMPLAR-BASED SPEECH ENHANCEMENT
Baby, Deepak
Virtanen, Tuomas
Barker, Tom
Van Hamme, Hugo
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[6] Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks
Sainath, Tara N.
Nahamoo, David
Kanevsky, Dimitri
Ramabhadran, Bhuvana
[J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2127 - 2130
[7] LOCAL TRAJECTORY BASED SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION WITH DEEP NEURAL NETWORK
You, Yongbin
Qian, Yanmin
Yu, Kai
[J]. 2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING, 2015, : 5 - 9
[8] SPEECH SEGMENT CLUSTERING FOR REAL-TIME EXEMPLAR-BASED SPEECH ENHANCEMENT
Nesbitt, David
Crookes, Danny
Ming, Ji
[J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5419 - 5423
[9] EXEMPLAR-BASED NOISE ROBUST AUTOMATIC SPEECH RECOGNITION USING MODULATION SPECTROGRAM FEATURES
Baby, Deepak
Virtanen, Tuomas
Gemmeke, Jort F.
Barker, Tom
Van Hamme, Hugo
[J]. 2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014, 2014, : 519 - 524
[10] Estimating Uncertainty to Improve Exemplar-Based Feature Enhancement for Noise Robust Speech Recognition
Kallasjoki, Heikki
Gemmeke, Jort F.
Palomaki, Kalle J.
[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (02) : 368 - 380
