Speech Dereverberation Using Long Short-Term Memory

Cited by: 0
Authors
Mimura, Masato [1]
Sakai, Shinsuke [1]
Kawahara, Tatsuya [1]
Affiliation
[1] Kyoto Univ, Acad Ctr Comp & Media Studies, Sakyo Ku, Kyoto 6068501, Japan
Keywords
Speech Dereverberation; Long Short-Term Memory (LSTM); Deep Autoencoder (DAE); NEURAL-NETWORKS; RECOGNITION; ALGORITHM;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
Recently, neural networks have been used not only for phone recognition but also for denoising and dereverberation. However, the conventional denoising deep autoencoder (DAE), which is based on a feed-forward structure, cannot handle the very long span of speech frames affected by reverberation. A long short-term memory (LSTM) network can be effectively trained to reduce the average error between the enhanced signal and the original clean signal by taking into account the effect of time frames far in the past. In this paper, we demonstrate that it is effective to consider a past context as long as the maximum reverberation time of the database. Since the effect of reverberation varies depending on the phone class of the whole speech context, we augment the input of the autoencoder with the phone-class information of the past frames as well as the current frame, and call this version of the LSTM autoencoder pLSTM. In a speech recognition experiment using the data set of the Reverb Challenge 2014, the LSTM front-end reduced the WER of the multi-condition DNN-HMM by 14.5%, and the use of the phone-class feature in pLSTM yielded a further improvement of 7.5%. The performance of pLSTM is comparable to that of pDAE, while its number of parameters is only 1/25 to 1/8 as large.
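The idea in the abstract can be illustrated with a minimal NumPy sketch of a single-layer LSTM regression front-end whose input is the reverberant feature frame concatenated with a phone-class vector. This is not the authors' implementation: the function names (`init_params`, `plstm_forward`, `mse`), the feature dimensions, the hidden size, the one-hot phone-class encoding, and the toy "reverberation" are all assumptions made for illustration, and only the forward pass and the error being minimized are shown, not training.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hidden, n_out, seed=0):
    """Randomly initialised weights for one LSTM layer plus a linear output layer."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (4 * n_hidden, n_in + n_hidden)),
        "b": np.zeros(4 * n_hidden),
        "W_out": rng.normal(0.0, 0.1, (n_out, n_hidden)),
        "b_out": np.zeros(n_out),
    }

def plstm_forward(params, reverberant, phone_class):
    """Map reverberant frames (T, n_feat) to enhanced frames (T, n_feat).

    Each input frame is augmented with a phone-class vector (T, n_phone).
    The recurrent state carries information from all past frames, so the
    output at frame t depends on the inputs at frames 0..t only.
    """
    x = np.concatenate([reverberant, phone_class], axis=1)
    n_hidden = params["b"].shape[0] // 4
    h = np.zeros(n_hidden)  # hidden state
    c = np.zeros(n_hidden)  # cell state (the long-term memory)
    outputs = []
    for x_t in x:
        z = params["W"] @ np.concatenate([x_t, h]) + params["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(params["W_out"] @ h + params["b_out"])
    return np.stack(outputs)

def mse(enhanced, clean):
    """Average squared error between enhanced and clean features."""
    return float(np.mean((enhanced - clean) ** 2))

# Toy data: hypothetical sizes, not taken from the paper.
T, n_feat, n_phone = 50, 40, 3
rng = np.random.default_rng(1)
clean = rng.normal(size=(T, n_feat))
# Crude stand-in for reverberation: add a delayed, attenuated copy of the signal.
reverberant = clean + 0.5 * np.roll(clean, 5, axis=0)
phone_class = np.eye(n_phone)[rng.integers(0, n_phone, T)]  # one-hot per frame

params = init_params(n_feat + n_phone, 64, n_feat)
enhanced = plstm_forward(params, reverberant, phone_class)
print(enhanced.shape, mse(enhanced, clean))
```

With untrained weights the error is of course large; in practice the parameters would be fitted by backpropagation through time to minimize `mse` over paired reverberant and clean utterances.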
Pages: 2435 - 2439
Number of pages: 5
Related papers
50 in total
  • [31] Detecting Overlapping Speech with Long Short-Term Memory Recurrent Neural Networks
    Geiger, Juergen T.
    Eyben, Florian
    Schuller, Bjoern
    Rigoll, Gerhard
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1667 - 1671
  • [32] Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition
    Oruh, Jane
    Viriri, Serestina
    Adegun, Adekanmi
    [J]. IEEE ACCESS, 2022, 10 : 30069 - 30079
  • [33] Short-term Load Forecasting with Distributed Long Short-Term Memory
    Dong, Yi
    Chen, Yang
    Zhao, Xingyu
    Huang, Xiaowei
    [J]. 2023 IEEE POWER & ENERGY SOCIETY INNOVATIVE SMART GRID TECHNOLOGIES CONFERENCE, ISGT, 2023,
  • [35] A short-term prediction model of global ionospheric VTEC based on the combination of long short-term memory and convolutional long short-term memory
    Chen, Peng
    Wang, Rong
    Yao, Yibin
    Chen, Hao
    Wang, Zhihao
    An, Zhiyuan
    [J]. JOURNAL OF GEODESY, 2023, 97 (05)
  • [36] QUANTUM LONG SHORT-TERM MEMORY
    Chen, Samuel Yen-Chi
    Yoo, Shinjae
    Fang, Yao-Lung L.
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8622 - 8626
  • [37] LIPREADING WITH LONG SHORT-TERM MEMORY
    Wand, Michael
    Koutnik, Jan
    Schmidhuber, Jurgen
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 6115 - 6119
  • [38] Associative Long Short-Term Memory
    Danihelka, Ivo
    Wayne, Greg
    Uria, Benigno
    Kalchbrenner, Nal
    Graves, Alex
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [39] Build A Module for Improvement Real Time Speech enhancement using Long Short-term Memory Approach
    Van Vo
    Bach Le Son
    Huy Vo Phuc
    [J]. PROCEEDINGS OF 2023 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION TECHNOLOGY, ICIIT 2023, 2023, : 259 - 264
  • [40] IMPROVING LONG SHORT-TERM MEMORY NETWORKS USING MAXOUT UNITS FOR LARGE VOCABULARY SPEECH RECOGNITION
    Li, Xiangang
    Wu, Xinhong
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4600 - 4604