SPEECH RECOGNITION WITH DEEP RECURRENT NEURAL NETWORKS

Cited by: 0

Authors:

Graves, Alex [1]

Mohamed, Abdel-rahman [1]

Hinton, Geoffrey [1]

Affiliations:

[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 1A1, Canada

Source:

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013

Keywords:

recurrent neural networks; deep neural networks; speech recognition

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Recurrent neural networks (RNNs) are a powerful model for sequential data. End-to-end training methods such as Connectionist Temporal Classification make it possible to train RNNs for sequence labelling problems where the input-output alignment is unknown. The combination of these methods with the Long Short-term Memory RNN architecture has proved particularly fruitful, delivering state-of-the-art results in cursive handwriting recognition. However, RNN performance in speech recognition has so far been disappointing, with better results returned by deep feedforward networks. This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long-range context that empowers RNNs. When trained end-to-end with suitable regularisation, we find that deep Long Short-term Memory RNNs achieve a test set error of 17.7% on the TIMIT phoneme recognition benchmark, which to our knowledge is the best recorded score.

Citation

Pages: 6645-6649

Page count: 5
