SPEECH RECOGNITION WITH DEEP RECURRENT NEURAL NETWORKS

Cited by: 0
Authors
Graves, Alex [1]
Mohamed, Abdel-rahman [1]
Hinton, Geoffrey [1]
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 1A1, Canada
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Keywords
recurrent neural networks; deep neural networks; speech recognition
DOI
Not available
Chinese Library Classification code
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Recurrent neural networks (RNNs) are a powerful model for sequential data. End-to-end training methods such as Connectionist Temporal Classification make it possible to train RNNs for sequence labelling problems where the input-output alignment is unknown. The combination of these methods with the Long Short-term Memory RNN architecture has proved particularly fruitful, delivering state-of-the-art results in cursive handwriting recognition. However, RNN performance in speech recognition has so far been disappointing, with better results returned by deep feedforward networks. This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs. When trained end-to-end with suitable regularisation, we find that deep Long Short-term Memory RNNs achieve a test set error of 17.7% on the TIMIT phoneme recognition benchmark, which to our knowledge is the best recorded score.
Pages: 6645-6649
Page count: 5