SPEECH ENHANCEMENT USING LONG SHORT-TERM MEMORY BASED RECURRENT NEURAL NETWORKS FOR NOISE ROBUST SPEAKER VERIFICATION

被引:0
|
作者
Kolbaek, Morten [1 ]
Tan, Zheng-Hua [1 ]
Jensen, Jesper [1 ]
机构
[1] Aalborg Univ, Dept Elect Syst, Fredrik Bajers Vej 7, DK-9220 Aalborg, Denmark
基金
欧盟地平线“2020”;
关键词
Speaker Verification; Long Short-Term Memory; Deep Neural Networks; Speech Enhancement; RSR2015; RECOGNITION; MODELS;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper we propose to use a state-of-the-art Deep Recurrent Neural Network (DRNN) based Speech Enhancement (SE) algorithm for noise robust Speaker Verification (SV). Specifically, we study the performance of an i-vector based SV system, when tested in noisy conditions using a DRNN based SE front-end utilizing a Long Short-Term Memory (LSTM) architecture. We make comparisons to systems using a Non-negative Matrix Factorization (NMF) based front-end, and a Short-Time Spectral Amplitude Minimum Mean Square Error (STSA-MMSE) based front-end, respectively. We show in simulation experiments that a male-speaker and text-independent DRNN based SE front-end, without specific a priori knowledge about the noise type outperforms a text, noise type and speaker dependent NMF based front-end as well as a STSA-MMSE based front-end in terms of Equal Error Rates for a large range of noise types and signal to noise ratios on the RSR2015 speech corpus.
引用
收藏
页码:305 / 311
页数:7
相关论文
共 50 条
  • [1] On Speaker Adaptation of Long Short-Term Memory Recurrent Neural Networks
    Miao, Yajie
    Metze, Florian
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1101 - 1105
  • [2] Long Short-Term Memory Networks for Noise Robust Speech Recognition
    Woellmer, Martin
    Sun, Yang
    Eyben, Florian
    Schuller, Bjoern
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2966 - 2969
  • [3] Robust Speech Recognition using Long Short-Term Memory Recurrent Neural Networks for Hybrid Acoustic Modelling
    Geiger, Juergen T.
    Zhang, Zixing
    Weninger, Felix
    Schuller, Bjoern
    Rigoll, Gerhard
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 631 - 635
  • [4] LOMBARD SPEECH SYNTHESIS USING LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORKS
    Bollepalli, Bajibabu
    Airaksinen, Manu
    Alku, Paavo
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5505 - 5509
  • [5] Detecting Overlapping Speech with Long Short-Term Memory Recurrent Neural Networks
    Geiger, Juergen T.
    Eyben, Florian
    Schuller, Bjoern
    Rigoll, Gerhard
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1667 - 1671
  • [6] Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks
    Chen, Zhuo
    Watanabe, Shinji
    Erdogan, Hakan
    Hershey, John R.
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3274 - 3278
  • [7] Session Based Recommendations Using Recurrent Neural Networks - Long Short-Term Memory
    Dobrovolny, Michal
    Selamat, Ali
    Krejcar, Ondrej
    [J]. INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2021, 2021, 12672 : 53 - 65
  • [8] Modeling Speaker Variability Using Long Short-Term Memory Networks for Speech Recognition
    Li, Xiangang
    Wu, Xihong
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1086 - 1090
  • [9] BIDIRECTIONAL QUATERNION LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORKS FOR SPEECH RECOGNITION
    Parcollet, Titouan
    Morchid, Mohamed
    Linares, Georges
    De Mori, Renato
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 8519 - 8523
  • [10] Long Short-Term Memory based Convolutional Recurrent Neural Networks for Large Vocabulary Speech Recognition
    Li, Xiangang
    Wu, Xihong
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3219 - 3223