SPEECH ENHANCEMENT USING LONG SHORT-TERM MEMORY BASED RECURRENT NEURAL NETWORKS FOR NOISE ROBUST SPEAKER VERIFICATION

Cited by: 0

Authors:

Kolbaek, Morten [1]

Tan, Zheng-Hua [1]

Jensen, Jesper [1]

Affiliations:

[1] Aalborg Univ, Dept Elect Syst, Fredrik Bajers Vej 7, DK-9220 Aalborg, Denmark

Source:

2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016) | 2016

Funding:

European Union Horizon 2020;

Keywords:

Speaker Verification; Long Short-Term Memory; Deep Neural Networks; Speech Enhancement; RSR2015; RECOGNITION; MODELS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper we propose to use a state-of-the-art Deep Recurrent Neural Network (DRNN) based Speech Enhancement (SE) algorithm for noise robust Speaker Verification (SV). Specifically, we study the performance of an i-vector based SV system when tested in noisy conditions using a DRNN based SE front-end utilizing a Long Short-Term Memory (LSTM) architecture. We make comparisons to systems using a Non-negative Matrix Factorization (NMF) based front-end and a Short-Time Spectral Amplitude Minimum Mean Square Error (STSA-MMSE) based front-end, respectively. We show in simulation experiments that a male-speaker and text-independent DRNN based SE front-end, without specific a priori knowledge about the noise type, outperforms a text, noise type and speaker dependent NMF based front-end as well as an STSA-MMSE based front-end in terms of Equal Error Rates for a large range of noise types and signal-to-noise ratios on the RSR2015 speech corpus.

Pages: 305-311

Page count: 7
