Deep Long Short-Term Memory Networks for Speech Recognition

Cited by: 0
Authors
Chien, Jen-Tzung [1]
Misbullah, Alim [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan
Keywords
speech recognition; acoustic modeling; hybrid neural network; long short-term memory;
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Speech recognition has been significantly improved by acoustic models based on deep neural networks, which can be realized as feedforward NNs (FNNs) or recurrent NNs (RNNs). In general, an FNN projects the observations onto a deep invariant feature space, while an RNN captures the temporal information in sequential data for speech recognition. An RNN based on long short-term memory (LSTM) is capable of storing inputs over a long time period and thus exploits a self-learned mechanism for long-range temporal context. Considering the complementary modeling capabilities of FNNs and RNNs, this paper presents a deep model constructed by stacking LSTM and FNN layers. Through the cascade of LSTM cells and fully-connected feedforward units, we explore the temporal patterns and summarize the long history of previous inputs in a deep learning machine. The experiments on the 3rd CHiME challenge and Aurora-4 show that stacks of the hybrid model with an FNN post-processor outperform stand-alone FNN and LSTM models, as well as the other hybrid models, for robust speech recognition.
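The cascade described in the abstract, an LSTM layer followed by a fully-connected feedforward post-processor, can be illustrated with a minimal forward-pass sketch. This is not the authors' architecture: the layer sizes, the random initialization, the ReLU activation, the peephole-free LSTM equations, the softmax output layer, and all names (`LSTMLayer`, `FeedforwardLayer`, `build_hybrid`, `hybrid_posteriors`) are assumptions chosen for illustration only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


class LSTMLayer:
    """Standard LSTM without peepholes (an assumption, not taken from the paper)."""

    def __init__(self, n_in, n_hid, rng):
        self.n_hid = n_hid
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t, h_{t-1}].
        self.W = {g: rand_matrix(n_hid, n_in + n_hid, rng) for g in "ifoc"}
        self.b = {g: [0.0] * n_hid for g in "ifoc"}

    def forward(self, frames):
        h = [0.0] * self.n_hid  # hidden state
        c = [0.0] * self.n_hid  # memory cell
        outputs = []
        for x in frames:
            z = x + h  # list concatenation of input frame and previous hidden state
            pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                   for g in "ifoc"}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            # The cell keeps a gated mix of its old content and the new candidate.
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            outputs.append(h)
        return outputs


class FeedforwardLayer:
    """Fully-connected layer with ReLU, applied to each frame independently."""

    def __init__(self, n_in, n_out, rng):
        self.W = rand_matrix(n_out, n_in, rng)
        self.b = [0.0] * n_out

    def forward(self, frames):
        return [[max(0.0, a + b) for a, b in zip(matvec(self.W, x), self.b)]
                for x in frames]


def build_hybrid(n_feat, n_hid, n_classes, seed=0):
    rng = random.Random(seed)
    return {
        "lstm": LSTMLayer(n_feat, n_hid, rng),
        "fnn": FeedforwardLayer(n_hid, n_hid, rng),  # FNN post-processor
        "out": rand_matrix(n_classes, n_hid, rng),   # linear output layer
    }


def hybrid_posteriors(model, frames):
    """Run LSTM, then FNN, then a softmax over classes for every frame."""
    h = model["lstm"].forward(frames)
    h = model["fnn"].forward(h)
    return [softmax(matvec(model["out"], x)) for x in h]
```

Calling `hybrid_posteriors(build_hybrid(4, 3, 5), frames)` on a list of 4-dimensional feature frames (each a Python list) returns one 5-way posterior vector per frame. The weights here are untrained, so the sketch only shows how temporal context from the LSTM feeds the frame-wise feedforward stage.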
Pages: 5
Related papers
  • [1] DEEP LONG SHORT-TERM MEMORY ADAPTIVE BEAMFORMING NETWORKS FOR MULTICHANNEL ROBUST SPEECH RECOGNITION
    Meng, Zhong
    Watanabe, Shinji
    Hershey, John R.
    Erdogan, Hakan
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 271 - 275
  • [2] Long Short-Term Memory Networks for Noise Robust Speech Recognition
    Woellmer, Martin
    Sun, Yang
    Eyben, Florian
    Schuller, Bjoern
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2966 - 2969
  • [3] Multilingual Convolutional, Long Short-Term Memory, Deep Neural Networks for Low Resource Speech Recognition
    Bukhari, Danish
    Wang, Yutian
    Wang, Hui
    [J]. ADVANCES IN INFORMATION AND COMMUNICATION TECHNOLOGY, 2017, 107 : 842 - 847
  • [4] Long Short-term Memory for Tibetan Speech Recognition
    Wang, Weizhe
    Chen, Ziyan
    Yang, Hongwu
    [J]. PROCEEDINGS OF 2020 IEEE 4TH INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2020), 2020, : 1059 - 1063
  • [5] CONSTRUCTING LONG SHORT-TERM MEMORY BASED DEEP RECURRENT NEURAL NETWORKS FOR LARGE VOCABULARY SPEECH RECOGNITION
    Li, Xiangang
    Wu, Xihong
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4520 - 4524
  • [6] BIDIRECTIONAL QUATERNION LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORKS FOR SPEECH RECOGNITION
    Parcollet, Titouan
    Morchid, Mohamed
    Linares, Georges
    De Mori, Renato
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 8519 - 8523
  • [7] Modeling Speaker Variability Using Long Short-Term Memory Networks for Speech Recognition
    Li, Xiangang
    Wu, Xihong
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1086 - 1090
  • [8] Endpoint Detection using Grid Long Short-Term Memory Networks for Streaming Speech Recognition
    Chang, Shuo-Yiin
    Li, Bo
    Sainath, Tara N.
    Simko, Gabor
    Parada, Carolina
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3812 - 3816
  • [9] HIGHWAY LONG SHORT-TERM MEMORY RNNS FOR DISTANT SPEECH RECOGNITION
    Zhang, Yu
    Chen, Guoguo
    Yu, Dong
    Yao, Kaisheng
    Khudanpur, Sanjeev
    Glass, James
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5755 - 5759
  • [10] Optical Music Recognition by Long Short-Term Memory Networks
    Baro, Arnau
    Riba, Pau
    Calvo-Zaragoza, Jorge
    Fornes, Alicia
    [J]. GRAPHICS RECOGNITION: CURRENT TRENDS AND EVOLUTIONS, GREC 2017, 2018, 11009 : 81 - 95