MULTIFRAME DEEP NEURAL NETWORKS FOR ACOUSTIC MODELING

Citations: 0

Authors:

Vanhoucke, Vincent [1]

Devin, Matthieu [1]

Heigold, Georg [1]

Affiliation:

[1] Google Inc, Mountain View, CA 94043 USA

Source:

2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013

Keywords:

deep neural networks; acoustic modeling;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Deep neural networks have been shown to perform very well as acoustic models for automatic speech recognition. Compared to Gaussian mixtures, however, they tend to be computationally very expensive, which makes them challenging to use in real-time applications. One key advantage of such neural networks is their ability to learn from very long observation windows of up to 400 ms. Given this very long temporal context, it is tempting to wonder whether one can run neural networks at a lower frame rate than the typical 10 ms, and whether there might be computational benefits to doing so. This paper describes a method of tying the neural network parameters over time which achieves performance comparable to the typical frame-synchronous model, while achieving up to a 4X reduction in the computational cost of the neural network activations.
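The frame-rate arithmetic in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: it only counts network forward passes, assuming (as a simplification) that cost is proportional to the number of passes and that one pass can cover several consecutive 10 ms frames. The function names and the `frames_per_eval` parameter are hypothetical and do not come from the paper.

```python
import math

FRAME_SHIFT_MS = 10  # typical frame shift quoted in the abstract


def network_evaluations(num_frames: int, frames_per_eval: int) -> int:
    """Count forward passes needed to cover num_frames frames when each
    pass produces outputs for frames_per_eval consecutive frames.
    (Hypothetical helper, for illustration only.)"""
    if num_frames < 0 or frames_per_eval < 1:
        raise ValueError("num_frames must be >= 0 and frames_per_eval >= 1")
    return math.ceil(num_frames / frames_per_eval)


def evaluation_reduction(duration_ms: int, frames_per_eval: int) -> float:
    """Ratio of frame-synchronous passes to multiframe passes for audio
    of the given duration. Assumes cost scales with the pass count."""
    num_frames = duration_ms // FRAME_SHIFT_MS
    if num_frames < 1:
        raise ValueError("duration must cover at least one frame")
    baseline = network_evaluations(num_frames, 1)
    multiframe = network_evaluations(num_frames, frames_per_eval)
    return baseline / multiframe


if __name__ == "__main__":
    # One second of audio: 100 frames at a 10 ms shift.
    print(network_evaluations(100, 1))    # frame-synchronous passes
    print(network_evaluations(100, 4))    # one pass per 4 frames
    print(evaluation_reduction(1000, 4))  # reduction factor
```

Under these assumptions, evaluating once every 4 frames gives a 4-fold drop in forward passes, which matches the upper bound quoted in the abstract. Any extra cost of producing several frames of output per pass is ignored here.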

Pages: 7582-7585

Page count: 4

50 records in total

[1] Deep Neural Networks for Acoustic Modeling in Speech Recognition
Hinton, Geoffrey
Deng, Li
Yu, Dong
Dahl, George E.
Mohamed, Abdel-rahman
Jaitly, Navdeep
Senior, Andrew
Vanhoucke, Vincent
Nguyen, Patrick
Sainath, Tara N.
Kingsbury, Brian
[J]. IEEE SIGNAL PROCESSING MAGAZINE, 2012, 29 (06) : 82 - 97
[2] Deep Neural Networks for Acoustic Modeling in the Presence of Noise
Santana, L. M. Q. D.
Santos, R. M.
Matos, L. N.
Macedo, H. T.
[J]. IEEE LATIN AMERICA TRANSACTIONS, 2018, 16 (03) : 918 - 925
[3] Modular Combination of Deep Neural Networks for Acoustic Modeling
Gehring, Jonas
Lee, Wonkyum
Kilgour, Kevin
Lane, Ian
Miao, Yajie
Waibel, Alex
[J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 94 - 98
[4] Distinct Triphone Acoustic Modeling Using Deep Neural Networks
Chen, Dongpeng
Mak, Brian
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2645 - 2649
[5] DEEP CONVOLUTIONAL NEURAL NETWORKS FOR ACOUSTIC MODELING IN LOW RESOURCE LANGUAGES
Chan, William
Lane, Ian
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 2056 - 2060
[6] Acoustic Modeling with Deep Neural Networks Using Raw Time Signal for LVCSR
Tueske, Zoltan
Golik, Pavel
Schluter, Ralf
Ney, Hermann
[J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 890 - 894
[7] Deep Neural Networks for Syllable based Acoustic Modeling in Chinese Speech Recognition
Li, Xiangang
Hong, Caifu
Yang, Yuning
Wu, Xihong
[J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
[8] Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling
Yi, Yuan-Hao
Ai, Yang
Ling, Zhen-Hua
Dai, Li-Rong
[J]. INTERSPEECH 2019, 2019, : 2593 - 2597
[9] Improving Russian LVCSR Using Deep Neural Networks for Acoustic and Language Modeling
Kipyatkova, Irina
[J]. SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 291 - 300
[10] Acoustic Modeling Using Deep Belief Networks
Mohamed, Abdel-rahman
Dahl, George E.
Hinton, Geoffrey
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (01) : 14 - 22
